Abstract

Two high order vector filters (HOFs) are developed for estimation in non-Gaussian
noise. These filters are constructed using nonlinear functions of the innovations
process. They are completely general in that the initial state, the measurement
noise, and the process noise can all have non-Gaussian distributions. The first
filter is designed for systems with asymmetric probability densities. The second
is designed for systems with symmetric probability densities. Experimental
evaluation for estimation in non-Gaussian noise, formed from Gaussian sum
distributions, shows that these filters perform much better than the standard
Kalman filter, and close to the optimal Bayesian estimator.

The problem of high resolution parameter estimation of superimposed sinusoids
is addressed using nonlinear filtering techniques. Six separate nonlinear filters
are evaluated for the estimation of the parameters of sinusoids in white and colored
Gaussian noise. Experimental evaluation demonstrates that the nonlinear filters
perform close to the Cramer-Rao bound for reasonable values of the initial estimation
error. The recursive technique developed here is well suited for time-varying
systems and for measurements with short data lengths.

A general approach to model order selection is presented based on joint
detection/estimation theory. The approach involves the simultaneous application of
maximum a posteriori (MAP) detection and nonlinear estimation using either the
extended Kalman filter when the noise is Gaussian, or the extended high order filter
(EHOF) when the noise is non-Gaussian. The problem is formulated as a multiple
hypothesis testing problem with assumed known a priori probabilities for each
hypothesis. Experimental evaluation of the approach demonstrates excellent
performance in selecting the correct model order and estimating the system parameters
for SNRs as low as -5 dB.

A  nonlinear  adaptive  detector/estimator  (NADE)  is  introduced  for  single  and 
multiple  sensor  data  processing.  The  problem  of  target  detection  from  returns  of 
monostatic  sensor(s)  is  formulated  as  a  nonlinear  joint  detection/estimation  problem 
on  the  unknown  parameters  in  the  signal  return.  The  unknown  parameters  involve 
the  presence  of  the  target,  its  range,  azimuth,  and  Doppler  velocity.  The  problems 
of  detecting  the  target  and  estimating  its  parameters  are  considered  jointly.  A 
bank  of  spatially  and  temporally  localized  nonlinear  filters  is  used  to  estimate  the  a 
posteriori  likelihood  of  the  existence  of  the  target  in  a  given  space-time  resolution 
cell.  Within  a  given  cell,  the  localized  filters  are  used  to  produce  refined  spatial 
estimates  of  the  target  parameters.  Excellent  performance  is  obtained  using  this 
technique  for  single  sensor  processing  and  for  centralized  data  fusion. 
Chapter 1

Introduction 

In  this  thesis  a  new  high  order  filter  (HOF)  is  developed  for  estimation  in  non- 
Gaussian  noise.  It  is  shown  that  this  new  filter  yields  improved  performance  over 
the  standard  linear  Kalman  filter  and  is  less  computationally  intensive  than  optimal 
non-Gaussian  filtering  techniques  such  as  Gaussian  sum  filters.  This  thesis  also 
addresses  parameter  estimation  in  the  context  of  several  signal  processing  problems. 
These  problems,  which  are  formulated  as  nonlinear  estimation  problems,  have  been 
traditionally  addressed  using  other  parametric  and  nonparametric  techniques.  It 
is  shown  that  nonlinear  filtering  techniques,  including  the  nonlinear  version  of  the 
HOF,  designated  the  extended  high  order  filter  (EHOF),  can  perform  very  well  for 
estimation  of  signal  parameters  in  Gaussian  and  non-Gaussian  noise. 

1.1  Motivation  for  the  Study 

The standard Kalman filter does not use the higher moments of the density
functions and therefore cannot adequately deal with non-Gaussian distributions.
Many of the existing techniques for estimation in the presence of non-Gaussian noise
require accurate knowledge of the density functions. Given this knowledge, they
attempt to approximate these functions using Gaussian sums or other approximations
in order to address the problem of nonlinear estimation in non-Gaussian noise. Other
methods make simplifying assumptions such as symmetrical distributions, small plant
noise, or small measurement noise in order to develop approximate filters. This
motivated a study of the filtering problem from a more general point of view. The goal
of this study is to develop filtering algorithms for systems in non-Gaussian noise that
use knowledge of the moments of the a priori distributions. In these algorithms no
assumptions are made about the power of the noise or the shape of the probability
density function.

Several  specific  problems  in  the  signal  processing  area  are  of  interest  in  the 
application  of  nonlinear  parameter  estimation  techniques  in  the  presence  of  Gaussian 
and  non-Gaussian  noise.  These  problems  are  also  associated  with  estimating  the 
parameters  of  sinusoids. 

A problem that has attracted a large amount of research is that of harmonic
retrieval. This problem consists of estimating some or all of the frequencies,
amplitudes, damping coefficients, and phases of superimposed sinusoids in white or
colored, Gaussian or non-Gaussian noise. Much of the work in the area of high
resolution spectral estimation or harmonic retrieval has been based on fitting an
autoregressive (AR) or autoregressive moving average (ARMA) model to the received
data. However, the performance of most modern high resolution estimation
techniques is severely degraded at low SNRs and/or short data lengths. This is
probably due to the fact that these techniques are heuristic least squares
modifications of algorithms that yield exact results when there is no noise or when the
available data is infinite. Quite often the initial conditions on a problem can be
bounded so that fairly accurate a priori estimates can be obtained. The harmonic
retrieval problem is successfully addressed in this thesis with nonlinear estimation
techniques.

A separate but related problem is that of model order selection. The objective
in model order selection is to determine the number of sinusoids embedded in
Gaussian and non-Gaussian noise. This problem is approached in this thesis with
joint detection/estimation techniques.

The  joint  detection/estimation  (JD/E)  procedure  is  presented  in  Chapter 
5.  The  procedure  is  structured  mathematically  so  that  it  can  be  employed  against 
problems  with  model  uncertainty,  initial  condition  uncertainty,  or  both.  The  JD/E 
technique  can  be  applied  to  any  type  of  noise,  assuming  the  density  function  is 
known.  This  technique  is  applied  in  subsequent  chapters  for  selected  sinusoidal 
detection  and  parameter  estimation  problems. 

Joint  detection/estimation  techniques  can  also  be  applied  to  the  estimation 
of  Doppler  shift  and  time  delay  from  an  echo  of  a  transmitted  signal.  Traditional 
solutions for this problem are based on Fourier transform implementations and
generally have poor resolution in the presence of short data lengths. It is shown how
estimates  from  multiple  sensors  can  be  combined  to  form  improved  estimates  of 
target  range,  geometric  angle,  and  velocity. 

1.2  Scope  of  the  Thesis 

Chapter  2  discusses  the  fundamentals  of  estimation  theory  and  presents  the 
primary  techniques  currently  used  to  perform  nonlinear  estimation  in  Gaussian 
noise, and linear estimation in non-Gaussian noise. This chapter is essentially
composed of background material that is needed for an understanding of the remainder
of  the  thesis. 

Chapter  3  presents  a  general  solution  to  the  problem  of  estimation  in  the 
presence  of  non-Gaussian  noise.  The  solution  is  based  on  high  order  powers  of 
the  innovations  process.  The  solution  is  entirely  general  in  that  the  plant  noise, 
the  measurement  noise,  or  the  initial  estimation  error  can  be  non-Gaussian  with 
symmetrical or asymmetrical distributions. The performance of the filter for non-
Gaussian  noise  is  compared  to  exact  Bayesian  filters.  Non-Gaussian  distributions  are 
created  using  a  sum  of  Gaussian  distributions.  Bayesian  filters  can  be  constructed 
to  give  optimal  performance  for  Gaussian  sum  distributions.  The  intent  of  this 
comparison  is  to  numerically  evaluate  the  performance  of  the  non-Gaussian  filters 
and  to  determine  where  these  filters  provide  improvement  in  state  estimation  over 
the  standard  Kalman  filter.  It  is  shown  that  the  high  order  filter  (HOF)  performs 
better  than  the  standard  Kalman  filter,  but  not  quite  as  well  as  the  optimal  Gaussian 
sum  filter. 

Chapter 4 shows that nonlinear filtering techniques can be used for high
resolution harmonic retrieval. Traditional approaches in this area have been concerned
with Fourier transforms or techniques based on autoregressive (AR) or autoregressive
moving average (ARMA) estimation. Many of these approaches are batch estimators
and, as such, cannot adequately deal with time varying systems. In addition,
most of these techniques cannot take advantage of a priori estimates of the initial
system state. It is shown that nonlinear filtering methods can give highly accurate
estimates (approaching the CR bound) of the parameters of sinusoids in white and
colored Gaussian noise. A particularly attractive filter to use in the harmonic
retrieval problem is the minimum variance filter. This filter requires exact expressions
for expected values of nonlinear functions of the state variables during each
iteration of the filter equations. Closed form expressions for these expected values are
developed for the specific nonlinear functions used in the harmonic retrieval
problem. Using these expressions it is expected that the minimum variance filter should
give better state estimates than the extended Kalman filter (EKF), especially when
there are large errors in the initial estimates. In this chapter Monte Carlo
simulations are used to compare the performance of several nonlinear filters to the CR
bound. Studies are performed to determine the effect of poor initial conditions on
the performance of these nonlinear filters.

The  joint  detection/estimation  (JD/E)  procedure  is  presented  in  Chapter 
5.  The  procedure  is  structured  mathematically  so  that  it  can  be  employed  against 
problems  with  model  uncertainty,  initial  condition  uncertainty,  or  both.  The  JD/E 
technique  can  be  applied  to  any  type  of  noise,  assuming  the  density  function  is 
known.  This  technique  is  applied  in  subsequent  chapters  for  selected  sinusoidal 
detection  and  parameter  estimation  problems. 

The  JD/E  technique  is  used  in  Chapter  6  to  perform  model  order  selection. 
A  general  approach  is  presented  for  determining  the  number  of  sinusoids  present 
in measurements corrupted by additive white Gaussian and non-Gaussian noise.
Experimental  evaluation  of  this  approach  demonstrates  excellent  performance  for 
model  order  selection  and  system  parameter  estimation  in  both  Gaussian  and  non- 
Gaussian  noise. 

Chapter 7 uses the JD/E approach to estimate time delay and Doppler shift
from echoes of a transmitted waveform. The problem of target detection from returns
of monostatic sensor(s) is formulated as a nonlinear joint detection/estimation
problem on the unknown parameters in the signal return. In this chapter it is assumed
that the target has been detected. The JD/E procedure is applied by segmenting
a large initial estimation error into smaller regions of uncertainty and operating an
independent nonlinear filter to perform parameter estimation for each of these
regions. It is found that this approach can help solve the problem of convergence to
local minima, which is characteristic of estimators such as the EKF.

In Chapter 8, the problems of detecting the target and estimating its parameters
are considered jointly. The fusion of parameter estimates from two spatially
separated  sensors  is  accomplished  using  the  JD/E  approach.  Several  hypotheses  are 
postulated  for  detection.  Each  hypothesis  corresponds  to  the  ability  of  each  sensor 
to  detect  the  target  in  its  area  of  coverage.  The  a  priori  probabilities  of  each  decision 
are  based  on  the  area  of  coverage  of  the  two  sensors.  For  each  hypothesis,  a  nonlinear 
filter  recursively  estimates  target  parameters.  The  maximum  likelihood  estimate  for 
a  given  hypothesis  is  then  determined  as  a  weighted  sum  of  the  estimates  from  each 
of  the  local  hypotheses,  with  the  a  posteriori  probability  being  used  as  the  weighting 
function.  It  is  shown  experimentally  that  excellent  performance  can  be  obtained  for 
both  target  detection  and  target  parameter  estimation  using  this  technique. 
Chapter 2

Optimal  and  Suboptimal  Estimation 

The  purpose  of  this  chapter  is  to  briefly  cover  the  fundamentals  of  estimation 
theory  and  to  discuss  several  techniques  for  nonlinear  estimation  found  in  existing 
literature.  Section  2.1  presents  the  basic  concepts  of  estimation  theory  and  some 
of  the  properties  of  estimators.  Section  2.2  presents  optimal  Bayesian  estimation. 
This  section  also  presents  the  derivation  of  the  linear  Kalman  filter,  which  is  the 
optimal  estimator  for  linear  systems  in  additive  white  Gaussian  noise.  In  Section 
2.3  Bayesian  approximations  are  discussed.  These  approximations  entail  methods 
for  estimation  of  the  a  posteriori  density  function.  Section  2.4  discusses  nonlinear 
filtering  techniques  for  nonlinear  systems  in  additive  Gaussian  noise.  Section  2.5 
presents  techniques  for  linear  filtering  in  non-Gaussian  noise.  The  overall  goal  of 
this  chapter  is  to  explain  the  basics  of  estimation  theory  and  to  show  the  evolution 
of  the  optimal  linear  estimator,  the  Kalman  Filter,  into  techniques  for  nonlinear 
estimation.  This  will  lay  the  groundwork  for  further  discussions  on  new  techniques 
presented  in  this  thesis  for  suboptimal  estimation  in  non-Gaussian  noise  and  for 
applications  of  nonlinear  filtering  to  specific  signal  processing  problems.  In  this 
thesis  only  discrete  time  (i.e.  sampled  data)  estimation  problems  are  addressed. 

2.1  Fundamentals  of  Estimation  Theory 

Estimation  theory  addresses  the  process  of  determining  the  value  of  some 
uncertain  quantity  based  on  available  pertinent  information.  Consider  the  problem 
of estimating the n-dimensional time invariant parameter vector $x$ from observations
represented by the m-dimensional vector $z_k$. The measurements are described by
the nonlinear relation

\[ z_k = h(x, k, v_k), \]

where $v_k$ is random noise. The estimate $\hat{x}_k$ is given by

\[ \hat{x}_k = e_k(Z_k), \]

where $Z_k$ is the set of all measurements $(z_1, z_2, \ldots, z_k)$. The function $e_k$ is called
the estimator of $x$. There are two basic models for the parameter $x$:

(1)  Nonrandom,  when  x  has  an  unknown  deterministic  value. 

(2) Random, when the parameter x has a priori probability density function
(PDF) p(x).

For nonrandom parameters it is desired that the estimates converge to the
true value as $k \to \infty$. For random time invariant parameters, a realization of x
is drawn from a population with the assumed PDF. One would like each measurement
to yield an estimate that converges in some well-defined probabilistic sense
independent of the particular realization of x.

Optimal estimation defines the best estimate of a parameter based on some
well-chosen criteria of optimality. Since different criteria may lead to different
optimal estimates for the same quantity, one may settle for feasible or acceptable
estimates according to the following rules [1]:

(1) An estimate $\hat{x}$ is unbiased if it satisfies the relation

\[ x = E[\hat{x}]. \]

(2) An estimate $\hat{x}_k$ is a consistent estimate if it converges in probability to $x$,
i.e.,

\[ \lim_{k \to \infty} \mathrm{prob}\left[\|\hat{x}_k - x\| > \epsilon\right] = 0 \quad \text{for arbitrarily small } \epsilon. \]

A  consistent  estimate  is  always  unbiased. 

(3) An efficient estimate $\hat{x}$ is the unbiased estimate of $x$ with the minimum
variance, i.e.,

\[ \sigma_{\hat{x}}^2 = E\left[\|\hat{x} - x\|^2\right] \le E\left[\|y - x\|^2\right] = \sigma_y^2 \]

for all other estimates $y$ of $x$.

(4)  An  estimate  x  is  called  sufficient  if  it  contains  all  of  the  information  in 
the  set  of  observed  values  regarding  the  parameter  x  to  be  evaluated.  Any  statistic 
related  to  a  sufficient  estimate  is  called  a  sufficient  statistic. 

Several  estimation  techniques  have  been  used  for  the  estimation  of  random 
parameters.  Many  of  these  techniques  are  derived  from  or  related  to  Bayesian 
estimation. 

The maximum a posteriori (MAP) estimate is obtained by maximizing the
conditional density

\[ p(x|z) = \frac{p(z|x)\,p(x)}{p(z)} \]

with respect to the unknown parameter vector x. Since p(z) is not a function of x,
the MAP estimate may be obtained by maximizing the joint density

\[ p(z|x)\,p(x) = p(z,x) \]

with respect to x. This can be accomplished by maximizing the natural logarithm
of this quantity so that the MAP estimate can be expressed by

\[ \left.\frac{\partial \ln p(z,x)}{\partial x}\right|_{x=\hat{x}(z)}
 = \left.\frac{\partial \ln p(z|x)}{\partial x}\right|_{x=\hat{x}(z)}
 + \left.\frac{\partial \ln p(x)}{\partial x}\right|_{x=\hat{x}(z)} = 0. \]

In  the  case  where  p(x)  is  unknown  the  best  choice  of  x  is  made  based  on  maximizing 
the  likelihood  function  p(z|x).  The  maximum  likelihood  (ML)  estimate  is  given  by 

\[ \left.\frac{\partial \ln p(z|x)}{\partial x}\right|_{x=\hat{x}(z)} = 0. \]

It  is  clear  that  the  ML  estimate  is  inferior  to  the  MAP  estimate  since  it  does  not 
consider  prior  information  about  the  random  vector  x.  However,  the  ML  estimate 
may  be  useful  in  situations  where:  (1)  the  parameter  x  is  unknown  but  not  random, 
(2)  the  a  priori  density  of  x  is  unknown,  or  (3)  the  density  functions  p(x|z)  or  p(x,  z) 
are  more  difficult  to  compute  than  p(z|x). 

Consider the problem of estimating a nonrandom parameter vector $x$ from
a single linear measurement of this vector in Gaussian noise. In this case the
measurement model is given by

\[ z = Hx + v \]

where $v \sim N(0, R)$. The likelihood function is given by

\[ p(z|x) = \frac{1}{(2\pi)^{m/2}|R|^{1/2}} \exp\left(-\tfrac{1}{2}(z - Hx)^T R^{-1} (z - Hx)\right), \]

and the maximum likelihood estimate of $x$ is the root of the equation

\[ -2(z - H\hat{x})^T R^{-1} H = 0 \]

leading to the ML estimate

\[ \hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} z. \]
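
As a hedged numerical illustration of the closed-form ML estimate for the linear Gaussian measurement model, the following Python sketch evaluates the estimate and checks that it satisfies the root condition. NumPy is assumed; the function name `ml_estimate` and all numerical values (H, R, the true parameter) are hypothetical and chosen only for illustration.

```python
import numpy as np

def ml_estimate(z, H, R):
    """ML estimate (H^T R^-1 H)^-1 H^T R^-1 z for z = Hx + v, v ~ N(0, R)."""
    Rinv_H = np.linalg.solve(R, H)   # R^-1 H
    Rinv_z = np.linalg.solve(R, z)   # R^-1 z
    return np.linalg.solve(H.T @ Rinv_H, H.T @ Rinv_z)

# Hypothetical example values (not taken from the thesis).
rng = np.random.default_rng(0)
x_true = np.array([1.0, -2.0])
H = rng.standard_normal((6, 2))
R = np.diag([0.1, 0.2, 0.1, 0.3, 0.2, 0.1])
z = H @ x_true + rng.multivariate_normal(np.zeros(6), R)

x_hat = ml_estimate(z, H, R)
# At the ML estimate the row vector (z - H x_hat)^T R^-1 H vanishes.
root_condition = (z - H @ x_hat) @ np.linalg.solve(R, H)
print(x_hat, root_condition)
```

Linear solves are used instead of explicit matrix inverses purely as a numerical convenience; the result is the same expression as the closed form.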

Least  squares  estimates  are  obtained  by  minimizing  the  sum  of  the  squared 
error  between  the  measurements  and  the  measurement  model.  It  can  be  shown 
[2]  that  if  the  noises  are  independent,  identically  distributed  (i.i.d.),  zero-mean, 
Gaussian  random  variables  the  least  squares  estimate  is  the  same  as  the  ML  estimate. 

The minimum mean square error (MMSE) estimate, or minimum variance
estimate, is obtained by minimizing the mean square error
$E[(x - \hat{x})^2 | Z_k]$ of the estimate based on the data up to and including time k.
The solution is the conditional mean expressed in terms of the conditional PDF

\[ \hat{x} = \int x\, p(x|Z_k)\, dx. \]

2.2  Optimal  Bayesian  Estimation 

An  optimal  estimate  is  defined  as  the  minimum  variance  estimate  or  the  mean 
of  the  conditional  density  function.  It  will  be  shown  in  this  section  that  recursion 
relations  can  be  set  up  to  determine  the  conditional  density  based  on  Bayes’  rule. 

Consider the problem of estimating a time varying n-dimensional state vector
$x_k$, where the state evolves according to the plant equation

\[ x_{k+1} = f(x_k, w_k). \tag{2.1} \]

The state $x_k$ is observed through the m-dimensional measurement vector $z_k$ given
by

\[ z_k = h(x_k, v_k) \tag{2.2} \]

where $w_k$ and $v_k$ are mutually independent white noise sequences. The problem
is to estimate the state $x_k$ from the measurements $Z_k$, where $Z_k$ is the set of all
measurements $(z_1, z_2, \ldots, z_k)$. The objective of Bayesian parameter estimation is to
recursively calculate the a posteriori density of the state. This density, also referred
to as the filtering density, can be obtained [3] through the recursion relations

\[ p(x_k|Z_k) = \frac{p(x_k|Z_{k-1})\, p(z_k|x_k)}{p(z_k|Z_{k-1})} \tag{2.3} \]

\[ p(x_k|Z_{k-1}) = \int p(x_{k-1}|Z_{k-1})\, p(x_k|x_{k-1})\, dx_{k-1} \tag{2.4} \]

where

\[ p(z_k|Z_{k-1}) = \int p(x_k|Z_{k-1})\, p(z_k|x_k)\, dx_k. \tag{2.5} \]

The initial density $p(x_0|z_0)$ is given by

\[ p(x_0|z_0) = \frac{p(z_0|x_0)\, p(x_0)}{p(z_0)}. \tag{2.6} \]

The density $p(z_k|x_k)$ in equation (2.3) can be determined by the a priori
measurement noise density $p(v_k)$ and the measurement equation (2.2). Likewise,
$p(x_k|x_{k-1})$ in (2.4) is determined from $p(w_{k-1})$ and equation (2.1). Knowledge of
these densities and $p(x_0)$ determines $p(x_k|Z_k)$ for all k. However, the major difficulty
with recursive Bayesian estimation is the closed form solution of the integration in
(2.4). This integral can be solved only for linear state and measurement equations
with Gaussian statistics, and a limited set of nonlinear systems.


The advantage of using Bayesian estimation is that once the a posteriori
density is obtained one can compute estimates based on any estimation criterion.
For example, the most probable estimate is found by maximizing the probability
that $\hat{x}_{k|k} = x_k$, yielding the solution $\hat{x}_{k|k} = \mathrm{Mode}\{p(x_k|Z_k)\}$. When the a priori
density is uniform, this estimate is identical to the ML estimate. If the criterion
is to minimize $\int \|x_k - \hat{x}_{k|k}\|^2 p(x_k|Z_k)\, dx_k$, the solution is $\hat{x}_{k|k} = E[x_k|Z_k]$. This is the
conditional mean estimate. If the criterion is to minimize the maximum of $|x_k - \hat{x}_{k|k}|$,
the solution is the minimax estimate defined by $\hat{x}_{k|k} = \mathrm{Median}\{p(x_k|Z_k)\}$.
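
The recursion (2.3) to (2.5) and the estimates derived from the a posteriori density can be sketched numerically by replacing the integrals with sums over a uniform grid. The following Python example is only a minimal illustration under stated assumptions: a scalar model x_k = a x_{k-1} + w_{k-1}, z_k = x_k + v_k with zero-mean Gaussian noises, made-up constants and measurements, and hypothetical helper names `gauss` and `bayes_step`. It is not the implementation used in this thesis.

```python
import numpy as np

def gauss(u, var):
    """Zero-mean Gaussian density with variance var, evaluated at u."""
    return np.exp(-0.5 * u**2 / var) / np.sqrt(2.0 * np.pi * var)

def bayes_step(grid, post_prev, z, a, q, r):
    """One prediction/update cycle on a uniform grid for the assumed model
    x_k = a*x_{k-1} + w_{k-1},  z_k = x_k + v_k,  w ~ N(0, q),  v ~ N(0, r)."""
    dx = grid[1] - grid[0]
    # (2.4): integrate p(x_{k-1}|Z_{k-1}) p(x_k|x_{k-1}) over x_{k-1}.
    trans = gauss(grid[:, None] - a * grid[None, :], q)  # rows x_k, cols x_{k-1}
    pred = trans @ post_prev * dx
    # (2.3): multiply by the likelihood p(z_k|x_k) and divide by (2.5).
    unnorm = pred * gauss(z - grid, r)
    evidence = unnorm.sum() * dx                         # (2.5)
    return unnorm / evidence

grid = np.linspace(-8.0, 8.0, 801)
a, q, r = 0.9, 0.1, 0.5            # hypothetical model constants
post = gauss(grid, 1.0)            # assumed initial density p(x_0|z_0) = N(0, 1)
for z in [0.5, 1.2, 0.8]:          # made-up measurements
    post = bayes_step(grid, post, z, a, q, r)

dx = grid[1] - grid[0]
x_mean = np.sum(grid * post) * dx  # conditional mean estimate
x_mode = grid[np.argmax(post)]     # most probable estimate
print(x_mean, x_mode)
```

Because the assumed model is linear and Gaussian, the grid posterior stays Gaussian and the mean and mode coincide up to the grid spacing; with non-Gaussian densities the same code applies but the two estimates generally differ. The cost of the grid grows rapidly with the state dimension, which is the practical difficulty noted above.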


In the case of linear systems in Gaussian noise, equations (2.3 - 2.6) can be
evaluated and the a posteriori density is Gaussian for all k. The conditional mean
and covariances for this system are the Kalman filter equations, which were first
introduced by R. E. Kalman [4]. In the development to follow the Kalman filter
relations are derived from the Bayesian recursion formulas. This derivation is based
on a similar development by Ho and Lee [5]. The linear plant and measurement
models have the form

\[ x_k = \Phi_{k-1} x_{k-1} + \Gamma_{k-1} w_{k-1} \]

\[ z_k = H_k x_k + v_k \]

where $w_k$ and $v_k$ are independent, white, Gaussian sequences with

\[ \begin{aligned}
E[v_k] &= E[w_k] = 0 \quad \forall k \\
E[v_k v_j^T] &= R_k \delta_{kj}, \quad E[w_k w_j^T] = Q_k \delta_{kj}, \quad E[v_k w_j^T] = 0 \quad \forall j,k.
\end{aligned} \tag{2.7} \]

The recursion starts from the initial conditions that $p(x_0|z_0)$ is Gaussian with

\[ \begin{aligned}
E[x_0|z_0] &= \hat{x}_{0|0} \\
\mathrm{Cov}[x_0|z_0] &= P_{0|0}.
\end{aligned} \tag{2.8} \]


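From (2.7) it is noted that $p(x_k|Z_{k-1})$ is Gaussian and independent of $v_k$ so
that

\[ \begin{aligned}
\hat{x}_{k|k-1} &= E[x_k|Z_{k-1}] = \Phi_{k-1} \hat{x}_{k-1|k-1} \\
P_{k|k-1} &= \mathrm{Cov}[x_k|Z_{k-1}] = \Phi_{k-1} P_{k-1|k-1} \Phi_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T.
\end{aligned} \tag{2.9} \]

Similarly, $p(z_k|Z_{k-1})$ is Gaussian and

\[ \begin{aligned}
E[z_k|Z_{k-1}] &= H_k \Phi_{k-1} \hat{x}_{k-1|k-1} \\
\mathrm{Cov}[z_k|Z_{k-1}] &= H_k P_{k|k-1} H_k^T + R_k.
\end{aligned} \tag{2.10} \]

Finally $p(z_k|x_k)$ is Gaussian with

\[ \begin{aligned}
E[z_k|x_k] &= H_k x_k \\
\mathrm{Cov}[z_k|x_k] &= R_k.
\end{aligned} \tag{2.11} \]

Using (2.9 - 2.11) in (2.3) gives

\[ \begin{aligned}
p(x_k|Z_k) = {} & \frac{|H_k P_{k|k-1} H_k^T + R_k|^{1/2}}{(2\pi)^{n/2} |R_k|^{1/2} |P_{k|k-1}|^{1/2}} \\
& \times \exp\Big\{ -\tfrac{1}{2} \big[ (x_k - \hat{x}_{k|k-1})^T P_{k|k-1}^{-1} (x_k - \hat{x}_{k|k-1}) \\
& \quad + (z_k - H_k x_k)^T R_k^{-1} (z_k - H_k x_k) \\
& \quad - (z_k - H_k \hat{x}_{k|k-1})^T (H_k P_{k|k-1} H_k^T + R_k)^{-1} (z_k - H_k \hat{x}_{k|k-1}) \big] \Big\}.
\end{aligned} \]

Completing the square in the exponent gives

\[ p(x_k|Z_k) = \frac{|H_k P_{k|k-1} H_k^T + R_k|^{1/2}}{(2\pi)^{n/2} |R_k|^{1/2} |P_{k|k-1}|^{1/2}}
\exp\left\{ -\tfrac{1}{2} (x_k - \hat{x}_{k|k})^T P_{k|k}^{-1} (x_k - \hat{x}_{k|k}) \right\} \tag{2.12} \]

where

\[ \begin{aligned}
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{z}_k \\
P_{k|k} &= (I_n - K_k H_k) P_{k|k-1},
\end{aligned} \tag{2.13} \]

$I_n$ is the n-dimensional identity matrix, $K_k$ is called the Kalman gain and is given by

\[ K_k = P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1}, \tag{2.14} \]

and the innovations $\tilde{z}_k$ are defined by

\[ \tilde{z}_k = z_k - H_k \hat{x}_{k|k-1}. \tag{2.15} \]

The filter error covariance in (2.13) can also be expressed as

\[ P_{k|k}^{-1} = P_{k|k-1}^{-1} + H_k^T R_k^{-1} H_k. \tag{2.16} \]

Since the a posteriori density is Gaussian, $\hat{x}_{k|k}$ is the most probable, the
conditional mean, and the minimax estimate.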


Equations  (2.9)  and  (2.13)  constitute  the  Kalman  filter  equations.  These 
equations  give  the  optimal,  or  minimum  variance,  estimator  for  linear  systems  in 
additive  white  Gaussian  noise.  Equation  (2.9)  is  used  to  extrapolate  or  predict  the 
estimate from time k - 1 to time k based on the plant characteristics. Equation
(2.13) updates or filters this estimate at time k based on the measurement. An
important note about the filter equation (2.13) is that the filtered estimate $\hat{x}_{k|k}$ is
a linear function of the innovations for the optimal linear filter. It will be shown
in  Chapter  3  how  higher  order  powers  of  the  innovations  can  be  used  to  develop 
filters  for  linear  systems  in  non-Gaussian  noise  with  symmetrical  and  asymmetrical 
probability  density  functions. 
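
The prediction step (2.9) and the update step (2.13) to (2.15) can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy; the function names `kf_predict` and `kf_update`, the constant-velocity model matrices, and the measurement values are all hypothetical and are not taken from the experiments in this thesis.

```python
import numpy as np

def kf_predict(x, P, Phi, Gamma, Q):
    """Extrapolation (2.9): propagate the estimate and covariance through the plant."""
    x_pred = Phi @ x
    P_pred = Phi @ P @ Phi.T + Gamma @ Q @ Gamma.T
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Measurement update (2.13) using the gain (2.14) and innovations (2.15)."""
    S = H @ P_pred @ H.T + R                  # innovations covariance
    K = np.linalg.solve(S, H @ P_pred).T      # K = P H^T S^-1 (S and P symmetric)
    innov = z - H @ x_pred                    # (2.15)
    x = x_pred + K @ innov                    # linear in the innovations
    P = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return x, P, innov

# Hypothetical constant-velocity example (illustrative values only).
Phi = np.array([[1.0, 1.0], [0.0, 1.0]])
Gamma = np.array([[0.5], [1.0]])
Q = np.array([[0.01]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])
x, P = np.array([0.0, 1.0]), np.eye(2)        # assumed initial estimate and covariance
for z in ([1.1], [1.9], [3.2]):               # made-up position measurements
    x_pred, P_pred = kf_predict(x, P, Phi, Gamma, Q)
    x, P, innov = kf_update(x_pred, P_pred, np.array(z), H, R)
print(x, P)
```

The update line `x = x_pred + K @ innov` makes explicit that the filtered estimate depends on the measurement only through a linear function of the innovations.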

The  Kalman  filter  equations  can  be  derived  in  many  ways.  Gelb  [6]  uses 
the  matrix  minimum  principle  on  the  a  posteriori  variance  to  obtain  the  Kalman 
filter relations. Kronhamn [7] derives the filter by geometrically demonstrating the
orthogonality of the estimation error to the measurement error. Chui and Chen [8]
use stochastic operator theory. Jazwinski [9] uses stochastic calculus to come up
with the relations for continuous-time systems. Kailath [16] derives the filter using
the innovations method. Using this technique the observed process is first converted
to a white noise process by means of a causal invertible linear transformation. The
problem  then  becomes  one  of  parameter  estimation  in  white  noise.  The  solution  to 
this  simplified  problem  can  then  be  expressed  in  terms  of  the  original  observations 
by  means  of  the  inverse  of  the  original  whitening  filter. 

Although  the  Kalman  filter  is  an  optimal  estimator  for  linear  systems  in 
Gaussian  noise,  its  performance  for  nonlinear  models  and  non-Gaussian  noise  is 
highly  dependent  on  the  degree  of  nonlinearity  or  non-Gaussianity  in  the  plant  and 
measurement  equations.  Nonlinear  models  are  generally  treated  with  the  extended 
Kalman  filter  in  which  the  state  and  measurement  models  are  linearized  about  the 
most  recent  estimate.  This  method  generally  works  well  in  low  noise  environments. 
In  large  noise  environments,  where  the  estimation  error  is  large,  the  Taylor  series 
expansions  can  be  very  inaccurate  [6]. 

Filtering  in  non-Gaussian  noise  has  generally  been  treated  in  the  literature 
using recursive Bayesian estimators which rely on an approximation of the a posteriori
density of the state variables. These Bayesian approximations are discussed in
the  next  section. 

2.3  Bayesian  Approximations 

Two problems are encountered when the system is nonlinear or the a
priori density is non-Gaussian. First, the integration in equation (2.4) is difficult
to carry out. Second, the moments are not easily obtained from equation (2.3). If
the  conditional  density  function  cannot  be  computed  analytically  then  the  next  best 
thing  is  to  form  accurate  approximations  of  this  density.  Several  numerical  methods 
have  been  developed  for  approximation  of  the  a  posteriori  density  function.  Some 
of  these  techniques  are  briefly  discussed  in  this  section. 

Alspach and Sorenson [10,11] attempt to approximate the a priori density using
a sum of Gaussian distributions. They apply their system to problems involving
nonlinear  state  and  measurement  systems  in  white  Gaussian  noise.  The  procedure 
results  in  parallel  operation  of  several  Kalman  filters.  There  are  as  many  Kalman 
filters  as  there  are  terms  in  the  Gaussian  sum.  The  convex  combination  of  these 
filters  is  formed  to  obtain  the  a  posteriori  density. 
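
The idea of running one Kalman filter per term of the Gaussian sum and forming a convex combination can be illustrated with a deliberately simplified case. The Python sketch below assumes a scalar state, a linear measurement z = h x + v with Gaussian v, and a two-term prior; the function name `gaussian_sum_update` and all numbers are hypothetical. It is not the nonlinear procedure of [10,11], only a sketch of the measurement update of the mixture.

```python
import numpy as np

def gaussian_sum_update(weights, means, variances, z, h, r):
    """Measurement update of a scalar Gaussian-sum prior for z = h*x + v, v ~ N(0, r).
    Each term receives its own Kalman update; the weights are rescaled by the
    likelihood of z under that term and renormalized so they stay convex."""
    s = h * h * variances + r                   # per-term innovation variances
    gains = variances * h / s                   # per-term Kalman gains
    innov = z - h * means                       # per-term innovations
    new_means = means + gains * innov
    new_vars = (1.0 - gains * h) * variances
    like = np.exp(-0.5 * innov**2 / s) / np.sqrt(2.0 * np.pi * s)
    new_weights = weights * like
    new_weights = new_weights / new_weights.sum()
    return new_weights, new_means, new_vars

# Hypothetical bimodal prior (values chosen only for illustration).
w = np.array([0.5, 0.5])
m = np.array([-2.0, 2.0])
P = np.array([1.0, 1.0])
w_post, m_post, P_post = gaussian_sum_update(w, m, P, z=1.5, h=1.0, r=0.5)
x_mean = np.sum(w_post * m_post)   # conditional mean of the mixture posterior
print(w_post, m_post, P_post, x_mean)
```

In this example the measurement lies closer to the second prior mode, so the second term receives the larger posterior weight. The number of parallel filters equals the number of terms in the sum, which is the source of the computational burden of this approach.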

Sorenson  and  Stubberud  [12]  approximate  the  a  posteriori  density  using  an 
Edgeworth  expansion.  Using  perturbation  techniques  the  plant  and  measurement 
systems  are  described  as  quadratic  equations  with  additive  white  Gaussian  noise. 
Recursion  relations  are  derived  for  a  finite  number  of  the  moments  of  the  Edgeworth 
expansions  and  these  relations  are  assumed  to  describe  the  set  of  sufficient  statistics 
for  the  system. 

Bucy  and  Senne  [13]  use  a  crude  convolution  summation  involving  an  ellipsoid 
tracking  technique  to  determine  the  important  points  to  include  in  the  summation 
for  the  conditional  density.  They  assume  that  the  conditional  densities  of  interest 
are  sufficiently  non-Gaussian  so  that  a  finite  number  of  moments  make  for  a  poor 
representation  of  them.  They  store  the  densities  as  a  vector  of  point  masses  relative 
to  a  rectangular  grid  which  is  free  to  be  rotated  and  translated  in  the  state  space  of 
the dynamical system.

Another method is to use spline filters [14,15] to construct the a posteriori density. Masi et al. [17] study nonlinear discrete time filtering problems using the Bayesian approach. The solution to the filtering problem is given in terms of a generalized finite-dimensional filter, in the sense that the generalized a posteriori conditional PDF is representable as a linear combination of distributions belonging to a given parameterized family, where the number of terms in the combination may vary with time. Using this concept they derive a technique for obtaining exact recursive solutions for various linear models with non-Gaussian disturbances, as well as for one nonlinear model with Gaussian disturbances.

All of these methods involve numerical approximations to the actual a posteriori density. The major limitation of these approaches is the computation time required for implementation.

2.4  Nonlinear  Filtering  in  Gaussian  Noise 

This section presents a discussion of filtering methods for nonlinear systems that are described by the equation

$$x_k = f_{k-1}(x_{k-1}) + \Gamma_{k-1} w_{k-1} \qquad (2.17)$$

with measurement model

$$z_k = h_k(x_k) + v_k \qquad (2.18)$$

where $v_k$ and $w_{k-1}$ are mutually independent white Gaussian noise sequences as described by equation (2.7).

In general, optimal Bayesian solutions cannot be expressed in closed form for this model, so methods for approximating the optimal nonlinear filter are required. Several nonlinear filters have been used for nonlinear systems in Gaussian noise. All of these filters are based on the model of the filtered state being a linear function of the innovations sequence as in equation (2.13). These suboptimal nonlinear filters include the extended Kalman filter, the modified second order Gaussian filter, the locally iterated Kalman filter, and the minimum variance filter. These filters are described in Sections 2.4.1 through 2.4.4, respectively. Jazwinski [9] points out that it is difficult to assess a priori the effects of the approximations made by these nonlinear techniques, and that their value in a particular problem must be determined by simulations.

2.4.1  Extended  Kalman  Filter  (EKF) 

The extended Kalman filter is obtained by making Gaussian assumptions about the a posteriori densities and by expanding the plant and measurement nonlinearities in a Taylor series that retains only the first order terms.

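The prediction error is defined as

$$\tilde{x}_{k|k-1} = x_k - \hat{x}_{k|k-1} \qquad (2.19)$$

and the filter error as

$$\tilde{x}_{k-1|k-1} = x_{k-1} - \hat{x}_{k-1|k-1}. \qquad (2.20)$$

If $f_{k-1}$ is expanded about the current estimate, then the first order approximation is

$$f_{k-1}(x_{k-1}) \approx f_{k-1}(\hat{x}_{k-1|k-1}) + F_{k-1} \tilde{x}_{k-1|k-1}, \qquad (2.21)$$

where

$$F_{k-1} = \left. \frac{\partial f_{k-1}(x_{k-1})}{\partial x_{k-1}} \right|_{x_{k-1} = \hat{x}_{k-1|k-1}}. \qquad (2.22)$$

In this case $\hat{f}_{k-1}(x_{k-1}) = f_{k-1}(\hat{x}_{k-1|k-1})$ and the prediction error becomes

$$\tilde{x}_{k|k-1} = F_{k-1} \tilde{x}_{k-1|k-1} + \Gamma_{k-1} w_{k-1}. \qquad (2.23)$$

This leads to the state prediction equations

$$\begin{aligned}
\hat{x}_{k|k-1} &= f_{k-1}(\hat{x}_{k-1|k-1}) \\
P_{k|k-1} &= E[\tilde{x}_{k|k-1} \tilde{x}_{k|k-1}^T] \\
&= F_{k-1} P_{k-1|k-1} F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T.
\end{aligned} \qquad (2.24)$$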


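With the measurement nonlinearity $h_k$ linearized about $\hat{x}_{k|k-1}$ in the same way, and $H_k$ denoting the corresponding Jacobian, the innovations vector becomes

$$\tilde{z}_k = H_k \tilde{x}_{k|k-1} + v_k. \qquad (2.27)$$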

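The filter and gain equations have the same form as the linear Kalman filter and are given by

$$\begin{aligned}
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{z}_k \\
P_{k|k} &= (I_n - K_k H_k) P_{k|k-1} \\
K_k &= P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1}.
\end{aligned} \qquad (2.28)$$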
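The prediction and update relations above can be sketched for a scalar state. This is an illustrative sketch rather than the author's implementation: the function name `ekf_step` and its argument list are assumptions, the innovation is computed as the measurement minus the predicted measurement $h_k(\hat{x}_{k|k-1})$, and the derivatives are supplied by the caller.

```python
import math


def ekf_step(x_est, p_est, z, f, f_jac, h, h_jac, q, r, gamma=1.0):
    """One scalar EKF cycle: prediction as in (2.24), update as in (2.28).

    f, h         : plant and measurement functions (callables)
    f_jac, h_jac : their first derivatives (callables)
    q, r         : process and measurement noise variances
    gamma        : scalar noise input gain
    """
    # Prediction: propagate the estimate through f, linearize for the variance.
    F = f_jac(x_est)
    x_pred = f(x_est)
    p_pred = F * p_est * F + gamma * q * gamma

    # Update: linearize h about the prediction.
    H = h_jac(x_pred)
    innovation = z - h(x_pred)
    K = p_pred * H / (H * p_pred * H + r)
    x_new = x_pred + K * innovation
    p_new = (1.0 - K * H) * p_pred
    return x_new, p_new


# Usage with a hypothetical model x_k = 0.9 x_{k-1} + w, z_k = sin(x_k) + v.
x1, p1 = ekf_step(0.5, 1.0, 0.4,
                  f=lambda x: 0.9 * x, f_jac=lambda x: 0.9,
                  h=math.sin, h_jac=math.cos, q=0.1, r=0.05)
```

When f and h are linear the step reduces to the standard Kalman filter, which gives a simple sanity check.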

2.4.2 Gaussian Second Order (GSO) Filter

The  Gaussian  second  order  filter  [6]  is  obtained  by  including  the  second  order 
terms  in  the  Taylor  series  expansion.  In  this  filter  it  is  assumed  that  all  errors  are 
Gaussian  and  therefore  all  odd  moments  are  zero. 


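The expansions for $f_{k-1}(\cdot)$ and $h_k(\cdot)$ are given by

$$\begin{aligned}
f_{k-1}(x_{k-1}) &\approx f_{k-1}(\hat{x}_{k-1|k-1}) + F_{k-1} \tilde{x}_{k-1|k-1} + \tfrac{1}{2} \partial^2(f_{k-1}, \tilde{x}_{k-1|k-1} \tilde{x}_{k-1|k-1}^T) \\
h_k(x_k) &\approx h_k(\hat{x}_{k|k-1}) + H_k \tilde{x}_{k|k-1} + \tfrac{1}{2} \partial^2(h_k, \tilde{x}_{k|k-1} \tilde{x}_{k|k-1}^T)
\end{aligned} \qquad (2.29)$$

where the operator $\partial^2(e, B)$ for any function $e(x)$ and any matrix $B$ is a vector whose ith element is defined by

$$\partial^2_i(e, B) = \mathrm{trace}\left\{ \left[ \frac{\partial^2 e_i}{\partial x_p \partial x_q} \right] B \right\}$$

for $1 \le p \le n$, $1 \le q \le n$. From (2.29) the estimates $\hat{f}_{k-1}(x_{k-1})$ and $\hat{h}_k(x_k)$ become

$$\begin{aligned}
\hat{f}_{k-1}(x_{k-1}) &= f_{k-1}(\hat{x}_{k-1|k-1}) + \tfrac{1}{2} \left. \partial^2(f_{k-1}, P_{k-1|k-1}) \right|_{x_{k-1} = \hat{x}_{k-1|k-1}} \\
\hat{h}_k(x_k) &= h_k(\hat{x}_{k|k-1}) + \tfrac{1}{2} \left. \partial^2(h_k, P_{k|k-1}) \right|_{x_k = \hat{x}_{k|k-1}}.
\end{aligned} \qquad (2.30)$$

The innovations vector is now

$$\tilde{z}_k = z_k - h_k(\hat{x}_{k|k-1}) - \tfrac{1}{2} \partial^2(h_k, P_{k|k-1}). \qquad (2.31)$$

The GSO filter relations are given by

$$\begin{aligned}
P_{k|k-1} &= F_{k-1} P_{k-1|k-1} F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T + A_{k-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{z}_k \\
P_{k|k} &= (I_n - K_k H_k) P_{k|k-1} \\
K_k &= P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k + B_k)^{-1}.
\end{aligned} \qquad (2.32)$$

In general, the matrices $A_{k-1}$ and $B_k$ contain fourth order moments. It is assumed that the prediction and filter PDFs are Gaussian for the development of the Gaussian second order filter. This assumption leads to the approximations

$$\begin{aligned}
[A_{k-1}]_{ij} &= \frac{1}{4} \sum_{p,q,m,n} \left. \frac{\partial^2 f_{k-1,i}}{\partial x_p \partial x_q} \left( c_{pm} c_{qn} + c_{pn} c_{qm} \right) \frac{\partial^2 f_{k-1,j}}{\partial x_m \partial x_n} \right|_{x_{k-1} = \hat{x}_{k-1|k-1}} \\
[B_k]_{ij} &= \frac{1}{4} \sum_{p,q,m,n} \left. \frac{\partial^2 h_{k,i}}{\partial x_p \partial x_q} \left( d_{pm} d_{qn} + d_{pn} d_{qm} \right) \frac{\partial^2 h_{k,j}}{\partial x_m \partial x_n} \right|_{x_k = \hat{x}_{k|k-1}}
\end{aligned} \qquad (2.33)$$

where $f_{k-1,i}$ denotes the ith element of $f_{k-1}(\cdot)$, $h_{k,i}$ denotes the ith element of $h_k(\cdot)$, the c's are elements of $P_{k-1|k-1}$, and the d's are elements of $P_{k|k-1}$.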
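For a scalar state the second order terms collapse to simple expressions, which makes the role of the correction easy to see. The sketch below is an illustration under that scalar assumption, with hypothetical function names: the operator $\partial^2(h, P)$ becomes $h''(\hat{x}) P$, and the scalar form of (2.33) becomes $B = \tfrac{1}{2} (h'')^2 P^2$.

```python
def gso_measurement_terms(h, h1, h2, x_pred, p_pred):
    """Scalar second order terms for a measurement function h.

    h1 and h2 are the first and second derivatives of h (callables).
    Returns the predicted measurement of (2.30), the slope H, and the
    scalar form of B in (2.33).
    """
    H = h1(x_pred)
    d2 = h2(x_pred) * p_pred                   # scalar d2(h, P)
    z_hat = h(x_pred) + 0.5 * d2               # bias-corrected prediction
    B = 0.5 * h2(x_pred) ** 2 * p_pred ** 2    # (1/4) h'' (2 P^2) h''
    return z_hat, H, B


def gso_update(x_pred, p_pred, z, h, h1, h2, r):
    """Scalar GSO measurement update following the form of (2.31)-(2.32)."""
    z_hat, H, B = gso_measurement_terms(h, h1, h2, x_pred, p_pred)
    K = p_pred * H / (H * p_pred * H + r + B)
    return x_pred + K * (z - z_hat), (1.0 - K * H) * p_pred
```

With h(x) = x squared the expansion is exact: for a Gaussian prediction with mean 1 and variance 0.5 the corrected prediction is 1.5, the true mean of x squared, and B equals 0.5, the variance of the squared error term. The first order EKF would predict 1 and would omit B from the gain.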


Another approximation, which was developed by Jazwinski [18] and Bass et al. [19], is the truncated second order filter. Third and higher order central moments are assumed to be zero in this filter. This results in slightly different equations than those shown above for the GSO filter. This filter is appropriate if the conditional density is almost symmetrical and concentrated near its mean. Still another version of the second order filter is the modified Gaussian second order filter [20].


Which type of filter is appropriate depends on the nonlinearities in the system and can only be determined accurately by Monte Carlo simulations. It is difficult to determine the effects of nonlinearities analytically. However, Jazwinski [9] points out that, in general, nonlinear effects appear to be significant when noise inputs are small and the estimation error variance is large. Large noise inputs effectively mask nonlinearities. In addition he claims that measurement nonlinearities become significant whenever they are comparable to, or larger than, the measurement noise. Thus, if the measurement noise is small, neglected measurement nonlinearities tend to bias the estimate and result in incorrect weighting of the observations.

2.4.3  Locally  Iterated  Kalman  Filter  (LIKF) 

The locally iterated Kalman filter is an enhanced version of the extended Kalman filter where, at each step of the iteration procedure, the measurement nonlinearity is linearized about the state estimate obtained from the EKF equations. This filter was first introduced by Denham and Pines [21]. The procedure is to repeatedly calculate $\hat{x}_{k|k}$, $K_k$, and $P_{k|k}$, each time linearizing about the most recent estimate. To develop this algorithm, denote the ith estimate of $\hat{x}_{k|k}$ by $\hat{x}_{k|k}(i)$ with $\hat{x}_{k|k}(0) = \hat{x}_{k|k-1}$ and expand $h_k(x_k)$ from equation (2.25) in the form

$$h_k(x_k) = h_k(\hat{x}_{k|k}(i)) + H_k \tilde{x}_{k|k}(i)$$

where

$$\begin{aligned}
H_k &= \left. \frac{\partial h_k(x_k)}{\partial x_k} \right|_{x_k = \hat{x}_{k|k}(i)} \\
\tilde{x}_{k|k}(i) &= x_k - \hat{x}_{k|k}(i).
\end{aligned}$$

The following recursion relations are developed [6]

$$\begin{aligned}
\hat{x}_{k|k}(i+1) &= \hat{x}_{k|k-1} + K_k(i) \left[ z_k - h_k(\hat{x}_{k|k}(i)) - H_k (\hat{x}_{k|k-1} - \hat{x}_{k|k}(i)) \right] \\
P_{k|k}(i) &= (I_n - K_k(i) H_k) P_{k|k-1} \\
K_k(i) &= P_{k|k-1} H_k^T (H_k P_{k|k-1} H_k^T + R_k)^{-1}
\end{aligned} \qquad (2.34)$$

where $i = 0, 1, \ldots$. The number of repetitions of the calculations shown above can be determined by requiring the magnitude of the difference between successive state estimates to be less than some small number.
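The recursion (2.34) and its stopping rule can be sketched for a scalar state as follows. The function name `likf_update`, the tolerance, and the iteration cap are assumptions chosen for illustration; the source specifies only that iteration stops when successive estimates differ by less than some small number.

```python
def likf_update(x_pred, p_pred, z, h, h_jac, r, tol=1e-10, max_iter=50):
    """Scalar locally iterated measurement update in the form of (2.34).

    Starts from x_{k|k}(0) = x_{k|k-1} and relinearizes h about the most
    recent iterate until successive estimates differ by less than tol.
    """
    x_i = x_pred
    for _ in range(max_iter):
        H = h_jac(x_i)                         # slope at the current iterate
        K = p_pred * H / (H * p_pred * H + r)
        x_next = x_pred + K * (z - h(x_i) - H * (x_pred - x_i))
        done = abs(x_next - x_i) < tol
        x_i = x_next
        if done:
            break
    # Variance computed from the linearization about the final iterate.
    H = h_jac(x_i)
    K = p_pred * H / (H * p_pred * H + r)
    return x_i, (1.0 - K * H) * p_pred
```

For a linear measurement function the first pass already gives the Kalman estimate and the second pass confirms convergence, so the iteration adds nothing; the benefit appears only when the slope of h changes appreciably between the prediction and the updated estimate.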

Jazwinski [9] gives the locally iterated Kalman filter a probabilistic interpretation. Between observations, the conditional mean and covariance matrix propagate according to first order, nonlinear theory. At an observation, assuming the a priori density is Gaussian, the filter solves for the conditional mode of the posterior density. The conditional covariance matrix is then computed according to first order theory. The conditional mode is then used for the conditional mean.

Some disadvantages of the LIKF are pointed out by Andrade Netto et al. [22]:

(1) The iteration scheme may converge very slowly. This may occur where the initial guess $\hat{x}_{k|k-1}$ lies near extrema of the function $h_k(\cdot)$.

(2) The a posteriori density may be multimodal and the iteration procedure may converge to local modes if it converges at all.

Another iteration scheme involves global iteration [9]. After processing the data $(z_1, z_2, \ldots, z_k)$, starting with the initial values $\hat{x}_0$ and $P_0$, the filtering operation is completed with estimates $\hat{x}_{k|k}$ and $P_{k|k}$. Then, assuming $w_k = 0$, the backward filter is implemented with $\hat{x}_{k|k}$ and $P_{k|k}$ as initial conditions. This gives smoothed estimates $\hat{x}_{0|k}$ and $P_{0|k}$. The data is then processed with the forward filter again, starting with $\hat{x}_{0|k}$ and $P_0$. This has the effect of changing the initial statistic $E[x_0]$.

2.4.4  Minimum  Variance  Filter  (MVF) 

The nonlinear filtering techniques discussed thus far are all based on a Taylor
series  expansion  of  the  nonlinear  equations  about  the  most  recent  estimate.  As 
such,  these  filters  are  subject  to  the  inherent  problems  of  local  linearizations  and 
may  lead  to  poor  performance.  Liang  and  Christenson  [23]  developed  filtering  and 
smoothing  algorithms  which  give  exact  estimates  at  each  iteration  of  the  filter.  They 
have  shown  that  for  certain  nonlinear  functions  such  as  polynomial  nonlinearities, 
exponential  functions,  and  sinusoids,  exact  expressions  for  the  state  estimates  can 
be  obtained  and  used  in  the  filter  relations  in  place  of  the  usual  approximations. 
At  each  step  in  the  operation  they  assume  that  the  prediction  and  filter  errors 
are  Gaussian.  They  have  compared  their  filter  to  the  EKF  and  other  filters  using 
numerical  examples  and  claim  that  their  filter  performs  much  better  than  the  EKF 
for  large  initial  error  variances. 

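The basic premise is that $E[f_{k-1}(x_{k-1})]$ and $E[h_k(x_k)]$ can be determined analytically such that

$$\hat{h}_k(x_k) = E[h_k(x_k)]_{x_k = \hat{x}_{k|k-1}}.$$

The innovations vector has the form

$$\tilde{z}_k = z_k - \hat{h}_k(x_k).$$

The filter equations have the general form

$$\begin{aligned}
\hat{x}_{k|k-1} &= \hat{f}_{k-1}(x_{k-1}) \\
P_{k|k-1} &= E[\tilde{f}_{k-1}(x_{k-1}) \tilde{f}_{k-1}(x_{k-1})^T] + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{z}_k \\
K_k &= E[\tilde{x}_{k|k-1} \tilde{h}_k(x_k)^T] \left( R_k + E[\tilde{h}_k(x_k) \tilde{h}_k(x_k)^T] \right)^{-1} \\
P_{k|k} &= P_{k|k-1} - K_k E[\tilde{h}_k(x_k) \tilde{x}_{k|k-1}^T]
\end{aligned} \qquad (2.35)$$

where

$$\begin{aligned}
\tilde{x}_{k|k-1} &= x_k - \hat{x}_{k|k-1} \\
\tilde{h}_k(x_k) &= h_k(x_k) - \hat{h}_k(x_k) \\
\tilde{f}_{k-1}(x_{k-1}) &= f_{k-1}(x_{k-1}) - \hat{f}_{k-1}(x_{k-1}).
\end{aligned} \qquad (2.36)$$

Analytical expressions for $E[\tilde{x}_k \tilde{f}_k(x_k)^T]$, $E[\tilde{x}_k \tilde{h}_k(x_k)^T]$, $E[\tilde{f}_k(x_k) \tilde{f}_k(x_k)^T]$, and $E[\tilde{h}_k(x_k) \tilde{h}_k(x_k)^T]$ are required in order to form exact expressions for the filter equations.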
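As an illustration of what such exact expressions look like, the sketch below treats a scalar sinusoidal measurement $h(x) = \sin x$ under the Gaussian assumption $x \sim N(m, P)$. The closed forms are standard Gaussian identities and are not copied from [23]; the function names are hypothetical.

```python
import math


def sin_moments(m, p):
    """Exact moments of h(x) = sin(x) when x ~ N(m, p).

    Returns E[h], E[(x - m)(h - E[h])] and E[(h - E[h])^2].
    """
    damp = math.exp(-0.5 * p)
    h_hat = math.sin(m) * damp                      # E[sin x]
    cross = p * math.cos(m) * damp                  # E[(x - m) sin x]
    second = 0.5 * (1.0 - math.cos(2.0 * m) * math.exp(-2.0 * p))  # E[sin^2 x]
    return h_hat, cross, second - h_hat ** 2


def mvf_update(x_pred, p_pred, z, r):
    """Scalar measurement update in the form of (2.35) for z = sin(x) + v."""
    h_hat, cross, var_h = sin_moments(x_pred, p_pred)
    K = cross / (r + var_h)
    return x_pred + K * (z - h_hat), p_pred - K * cross
```

The factor exp(-P/2) shows why the difference from the EKF grows with the prediction variance: the EKF predicts the measurement as sin(m) regardless of P, whereas the exact mean shrinks toward zero as the uncertainty spreads over more of the sinusoid.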

It is important to note that the filter equations developed by Liang and Christenson [23] have been presented before (e.g. Jazwinski [9]). However, their contribution is the development of exact expressions for specific types of nonlinearities including polynomial, exponential, and sinusoid nonlinearities, assuming Gaussian a posteriori density functions. Liang [24] gives general expressions for the probability density functions for these types of nonlinearities. Liang [25] evaluates several system models with the standard EKF and the MVF. He concludes that, in general, the MVF performs much better than the EKF for large initial error variances and small noise variances. However, when the initial variances are small, and the noise variances are not too small, the EKF can be expected to perform about as well as any other filter. He also claims that when the level of noise inputs is large enough to effectively cover the effects of nonlinearities, no particular filter can be said to be consistently superior to any other filter. In most cases, however, the MVF should outperform all other nonlinear filters considered.

Kramer and Sorenson [3] compare the performance of the MVF to the optimal Bayesian estimator for a specific bilinear model. They found that there is a wide margin between the performance of the suboptimal filter (MVF) and the optimal Bayesian filter. They generalize that when the level of noise inputs is large enough to mask the effects of nonlinearities, point estimators such as the MVF and EKF tend to perform close to the optimum. These estimators may, however, be quite sensitive to initial conditions, and the MVF still fails to capture important features of the a posteriori densities.

2.5  Estimation  in  Non-Gaussian  Noise 

Most of the work done in filtering in non-Gaussian noise has been done from the Bayesian point of view. These techniques are discussed in Sections 2.2 and 2.3. With a few notable exceptions, however, very little work has been done in the area of linear filtering of systems with non-Gaussian plant noise, measurement noise, or initial error variances. Some of the approximations found in the literature are discussed in this section.

Masreliez [26] developed two methods for non-Gaussian filtering. The first filter is used for situations in which the observation prediction density is approximately Gaussian at each stage, but the observation disturbances are non-Gaussian. He develops the filter using a nonlinear ("score") function of the innovations vector. The second filter applies to systems with non-Gaussian plant noise but Gaussian measurement noise. He compared his filters to the exact Bayesian minimum variance filter developed by Alspach and Sorenson [10] using a sum of two Gaussian distributions. Simulation runs indicated that the exact MV filter and the approximations presented in that paper coincide and that these filters outperform the Kalman filter. However, the author notes that the score function is very sensitive to small errors in the density approximations, and suggests that ad hoc filters may be constructed to approximate the densities.

Another approach is taken by Rao and Yar [27], who developed two filters for tracking nonlinear processes for scalar models. In the first technique, called the polynomial filter, they used a general nth power of the innovations process for symmetrical plant noise, measurement noise, and initial variance distributions to develop relations that could be used to obtain the filter gain(s). However, they consider only the scalar case with symmetrical distributions. The second filter, labeled the measurement noise dependent filter, is based on a general nonlinear model of the innovations. This filter is constrained by the fact that one must have exact knowledge of the measurement noise distribution.

Verriest  [28]  proposed  a  filter  that  would  operate  in  multiplicative  or  non- 
Gaussian  noise.  This  filter  was  set  up  for  symmetric  distributions,  and  equations 
were  developed  based  on  linear  approximations.  However,  these  equations  were  not 
verified with numerical results.

An exact formula for computing the conditional mean has been derived by Daum [29] for discrete time observations with non-Gaussian measurement noise. The derivation of this formula is based on a certain homotopy function. He mentions that in order to use his formula a conditional expectation must be computable, which is not generally the case in nonlinear estimation problems.


Chapter  3 


High Order Filters for Estimation in Non-Gaussian Noise

In this chapter high order vector filter equations are developed for estimation in non-Gaussian noise. The difference between the filters developed here and the standard Kalman filter is that the filter equation contains nonlinear functions of the innovations process. These filters are general in that the initial state error, the measurement noise, and the process noise can all have non-Gaussian distributions. Two filter structures are developed. The first filter is designed for systems with asymmetric probability densities. The second is designed for systems with symmetric probability densities. Experimental evaluation of these filters for estimation in non-Gaussian noise, formed from Gaussian sum distributions, shows that these filters perform much better than the standard Kalman filter, and close to the optimal Bayesian estimator.

The new filters are referred to as high order filters (HOFs). For both of these filters it is assumed that the 5th and higher order moments of all densities are negligible. As such, these filters are approximations of the optimal minimum variance solution. However, it is shown through simulation experiments that these filters can approach the performance of the optimal minimum variance filter under certain conditions. The performance of the HOFs is compared to the standard Kalman filter, which uses only first and second moments, and to the optimal Bayesian estimator. The Gaussian sum distributions, for which the optimal Bayesian estimator has been derived by Sorenson and Alspach [10], were used as the test-bed for comparison. Unlike the measurement noise dependent (MND) filter described in [26], which requires complete knowledge of the entire a priori densities, the HOFs require knowledge of only a finite number of moments of these densities. The optimal Bayesian estimator of Sorenson and Alspach [10] also requires accurate knowledge of the a priori densities so that an approximation can be made using Gaussian sums. All techniques previously developed for non-Gaussian filtering are computationally intensive. The HOFs developed here share that characteristic. However, they are much less computationally intensive than the Gaussian sum filter.

3.1  System  Model 

Consider the problem of estimating the n-dimensional state vector $x_k$ from $K$ measurements of the m-dimensional vector $z_k$. The linear plant and measurement equations have the form

$$\begin{aligned}
x_k &= \Phi_{k-1} x_{k-1} + w_{k-1} \\
z_k &= H_k x_k + v_k
\end{aligned} \qquad (3.1)$$

where $w_{k-1}$ and $v_k$ are mutually independent, white, zero-mean, possibly non-Gaussian random sequences. The uncertainty in the initial estimate $\hat{x}_0$ may also have a non-Gaussian distribution and is independent of $w_{k-1}$ and $v_k$. It is also assumed that the 2nd through 4th moments of the distributions of $x_0$, $w_{k-1}$ and $v_k$ are known.


The Kronecker product operator $\otimes$ [30] is used so that only two-dimensional matrix operations are needed throughout this derivation. The Kronecker product of an $m \times n$ matrix $A$ with a matrix $B$ is defined by

$$A \otimes B = \begin{bmatrix} a_{11} B & a_{12} B & \cdots & a_{1n} B \\ a_{21} B & a_{22} B & \cdots & a_{2n} B \\ \vdots & \vdots & & \vdots \\ a_{m1} B & a_{m2} B & \cdots & a_{mn} B \end{bmatrix} \qquad (3.2)$$

where $a_{ij}$ is the ijth element of the matrix $A$. The Kronecker product has higher algebraic order than multiplication; that is, $A \otimes B \, C$ denotes $(A \otimes B) C$.

For arbitrary matrices $A$ and $B$ and arbitrary column vectors $a$ and $b$, the Kronecker product has the following properties:

$$\begin{aligned}
(A a) \otimes (B b) &= (A \otimes B)(a \otimes b) \\
(A a) \otimes b &= (A \otimes I_m)(a \otimes b) \\
(A \otimes B)^T &= A^T \otimes B^T \\
a \otimes b^T &= b^T \otimes a = a b^T
\end{aligned} \qquad (3.3)$$

where $b$ is an m-dimensional vector, and $I_m$ is the $m \times m$ identity matrix.
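The properties in (3.3) are easy to confirm numerically. The following sketch uses NumPy's `numpy.kron` on randomly generated matrices and vectors; the dimensions and the seed are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.normal(size=(n, n))
B = rng.normal(size=(m, m))
a = rng.normal(size=(n, 1))      # column vectors
b = rng.normal(size=(m, 1))

checks = {
    # (Aa) (x) (Bb) = (A (x) B)(a (x) b)
    "mixed product": np.allclose(np.kron(A @ a, B @ b),
                                 np.kron(A, B) @ np.kron(a, b)),
    # (Aa) (x) b = (A (x) I_m)(a (x) b)
    "identity factor": np.allclose(np.kron(A @ a, b),
                                   np.kron(A, np.eye(m)) @ np.kron(a, b)),
    # (A (x) B)^T = A^T (x) B^T
    "transpose": np.allclose(np.kron(A, B).T, np.kron(A.T, B.T)),
    # a (x) b^T = b^T (x) a = a b^T
    "outer product": (np.allclose(np.kron(a, b.T), a @ b.T)
                      and np.allclose(np.kron(b.T, a), a @ b.T)),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The last property is the one that lets moments such as $E[x \otimes x^T]$ be read as ordinary covariance-type matrices.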


In the development to follow the column stack operator is also used. If an $n \times n$ matrix $A$ consists of columns $a_1, a_2, \ldots, a_n$ then the column stack of $A$ is defined by

$$\mathrm{cst}(A) = [\, a_1^T \; a_2^T \; \cdots \; a_n^T \,]^T \qquad (3.4)$$

$\mathrm{cst}(A)$ is dimensioned $nn \times 1$. If $A = E[x x^T]$ then $\mathrm{cst}(A) = E[x \otimes x]$, where $E[\cdot]$ is the expectation operator.

The semi-column stack is defined as follows: if the matrix $A$ is dimensioned $nn \times nn$, consisting of columns $a_1, a_2, \ldots, a_{nn}$, then the semi-column stack of $A$ is given by

$$\mathrm{scst}(A) = \begin{bmatrix} a_1 & a_{n+1} & \cdots & a_{(n-1)n+1} \\ a_2 & a_{n+2} & \cdots & a_{(n-1)n+2} \\ \vdots & \vdots & & \vdots \\ a_n & a_{n+n} & \cdots & a_{(n-1)n+n} \end{bmatrix}$$

$\mathrm{scst}(A)$ is dimensioned $nnn \times n$. If $A = E[x \otimes x \otimes x^T \otimes x^T]$ then $\mathrm{scst}(A) = E[x \otimes x \otimes x \otimes x^T]$.
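Both stacking operators can be sketched in a few lines of NumPy. The function names `cst` and `scst` follow the text; the zero-based indexing in the implementation is a translation of the one-based column indices above, and the test vector is arbitrary.

```python
import numpy as np


def cst(A):
    """Column stack: the columns of A placed one under another."""
    return A.reshape(-1, 1, order="F")


def scst(A, n):
    """Semi-column stack of an (n*n) x (n*n) matrix A.

    Returns an (n**3) x n array whose block in block-row p and column q
    (zero-based) is column q*n + p of A.
    """
    block_rows = []
    for p in range(n):
        block_rows.append(np.column_stack([A[:, q * n + p] for q in range(n)]))
    return np.vstack(block_rows)


# Deterministic check with a single vector standing in for the expectation.
x = np.array([[1.0], [2.0], [3.0]])
xx = np.kron(x, x)                     # x (x) x, shape (9, 1)
second = x @ x.T                       # plays the role of E[x x^T]
fourth = xx @ xx.T                     # plays the role of E[x (x) x (x) x^T (x) x^T]

cst_ok = np.allclose(cst(second), xx)
scst_ok = np.allclose(scst(fourth, 3), np.kron(xx, x) @ x.T)
```

Because expectation is linear, the same identities hold for averages over many samples, which is how the moment matrices are used in the derivation.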

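The 2nd, 3rd, and 4th moments of the random vector $w_k$ are given as

$$\begin{aligned}
E[w_k \otimes w_j^T] &= Q_k^{(2)} \delta_{kj} \\
E[w_k \otimes w_j \otimes w_l^T] &= Q_k^{(3)} \delta_{kjl} \\
E[w_k \otimes w_j \otimes w_l^T \otimes w_m^T] &= Q_k^{(4)} \delta_{kjlm}
\end{aligned}$$

Similarly, the 2nd, 3rd, and 4th moments of the random vector $v_j$ are given by $R_k^{(2)} \delta_{kj}$, $R_k^{(3)} \delta_{kjl}$, and $R_k^{(4)} \delta_{kjlm}$. The moments of the initial estimation error are given by $P_0^{(2)}$, $P_0^{(3)}$, and $P_0^{(4)}$.

Let the prediction error $\tilde{x}_{k|k-1}$ and the filtered error $\tilde{x}_{k|k}$ be defined as

$$\begin{aligned}
\tilde{x}_{k|k-1} &= x_k - \hat{x}_{k|k-1} \\
\tilde{x}_{k|k} &= x_k - \hat{x}_{k|k}
\end{aligned} \qquad (3.6)$$

where the hat indicates expected value. The innovations vector is given by

$$\begin{aligned}
\tilde{z}_k &= z_k - H_k \hat{x}_{k|k-1} \\
&= H_k \tilde{x}_{k|k-1} + v_k.
\end{aligned} \qquad (3.7)$$

It can be shown [27] that $E[(x_k - \hat{x}_{k|k-1}) \,|\, \tilde{z}_k]$ is a function of only $\tilde{z}_k$ so that

$$E[(x_k - \hat{x}_{k|k-1}) \,|\, \tilde{z}_k] = \sum_{i=0}^{\infty} K_k^{(i)} \tilde{z}_k^{\otimes i}$$

where the superscript $\otimes i$ denotes the ith Kronecker product of the vector $\tilde{z}_k$. $K_k^{(i)}$ denotes the ith order filter gain, which has dimension $n \times m^i$. It follows that

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + \sum_{i=0}^{\infty} K_k^{(i)} \tilde{z}_k^{\otimes i}.$$

Using (3.6) the expression for the filter error becomes

$$\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - \sum_{i=0}^{\infty} K_k^{(i)} \tilde{z}_k^{\otimes i}. \qquad (3.8)$$

By setting $K_k^{(i)} = 0$ for $i > 1$, the standard linear Kalman filter results. In order to bound the equations for the derivation of the high order filters it is assumed that $\tilde{z}_k^{\otimes i}$ is negligible for $i > l$. The truncated relation now becomes

$$\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - \sum_{i=0}^{l} K_k^{(i)} \tilde{z}_k^{\otimes i}. \qquad (3.9)$$

Equation (3.9) forms the basis for the development of the HOFs.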

3.2  Non-Gaussian  Filtering  for  Asymmetrical  Distributions 

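The non-Gaussian filter for asymmetrical distributions is derived by letting $l = 2$ in (3.9) to obtain the filter error

$$\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - K_k^{(0)} - K_k^{(1)} \tilde{z}_k - K_k^{(2)} \tilde{z}_k^{\otimes 2}. \qquad (3.10)$$

The estimator must be unbiased, so the expected value of the filter error is required to be zero. Since $E[\tilde{x}_{k|k-1}] = E[v_k] = 0$, using (3.7) in (3.10) gives

$$E[\tilde{x}_{k|k}] = 0 = -K_k^{(0)} - K_k^{(2)} E[\tilde{z}_k^{\otimes 2}] \qquad (3.11)$$

where $E[\tilde{z}_k^{\otimes 2}] = \mathrm{cst}(H_k P_{k|k-1}^{(2)} H_k^T + R_k^{(2)})$. Substituting (3.11) into (3.10) yields

$$\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - K_k^{(1)} \tilde{z}_k - K_k^{(2)} \zeta_k \qquad (3.12)$$

where $\zeta_k$ is defined for notational convenience as

$$\zeta_k = \left( \tilde{z}_k^{\otimes 2} - E[\tilde{z}_k^{\otimes 2}] \right) = \left( (\tilde{z}_k \otimes \tilde{z}_k) - \mathrm{cst}(H_k P_{k|k-1}^{(2)} H_k^T + R_k^{(2)}) \right) \qquad (3.13)$$

which is a second order function of the innovations with $E[\zeta_k] = 0$. The corresponding filter equation is

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k^{(1)} \tilde{z}_k + K_k^{(2)} \zeta_k. \qquad (3.14)$$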


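The formulas for the gains $K_k^{(1)}$ and $K_k^{(2)}$ result from the requirement for a minimum variance solution. Using (3.12) the equation for the variance of the a posteriori density becomes

$$\begin{aligned}
P_{k|k} &= E[\tilde{x}_{k|k} \tilde{x}_{k|k}^T] \\
&= E[\tilde{x}_{k|k-1} \tilde{x}_{k|k-1}^T] - E[\tilde{x}_{k|k-1} \tilde{z}_k^T] K_k^{(1)T} - E[\tilde{x}_{k|k-1} \zeta_k^T] K_k^{(2)T} \\
&\quad - K_k^{(1)} E[\tilde{z}_k \tilde{x}_{k|k-1}^T] + K_k^{(1)} E[\tilde{z}_k \tilde{z}_k^T] K_k^{(1)T} + K_k^{(1)} E[\tilde{z}_k \zeta_k^T] K_k^{(2)T} \\
&\quad - K_k^{(2)} E[\zeta_k \tilde{x}_{k|k-1}^T] + K_k^{(2)} E[\zeta_k \tilde{z}_k^T] K_k^{(1)T} + K_k^{(2)} E[\zeta_k \zeta_k^T] K_k^{(2)T}.
\end{aligned} \qquad (3.15)$$

The gains are then determined from the matrix minimum principle [23] by evaluating

$$\frac{\partial \, \mathrm{trace}\{P_{k|k}\}}{\partial K_k^{(1)}} = 0, \qquad \frac{\partial \, \mathrm{trace}\{P_{k|k}\}}{\partial K_k^{(2)}} = 0. \qquad (3.16)$$

Carrying out these operations on (3.15) yields

$$\begin{aligned}
K_k^{(1)} &= \left( E[\tilde{x}_{k|k-1} \tilde{z}_k^T] - E[\tilde{x}_{k|k-1} \zeta_k^T] E[\zeta_k \zeta_k^T]^{-1} E[\zeta_k \tilde{z}_k^T] \right) \left( E[\tilde{z}_k \tilde{z}_k^T] - E[\tilde{z}_k \zeta_k^T] E[\zeta_k \zeta_k^T]^{-1} E[\zeta_k \tilde{z}_k^T] \right)^{-1} \\
K_k^{(2)} &= \left( E[\tilde{x}_{k|k-1} \zeta_k^T] - E[\tilde{x}_{k|k-1} \tilde{z}_k^T] E[\tilde{z}_k \tilde{z}_k^T]^{-1} E[\tilde{z}_k \zeta_k^T] \right) \left( E[\zeta_k \zeta_k^T] - E[\zeta_k \tilde{z}_k^T] E[\tilde{z}_k \tilde{z}_k^T]^{-1} E[\tilde{z}_k \zeta_k^T] \right)^{-1}.
\end{aligned} \qquad (3.17)$$
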
It is observed that $E[\zeta_k \zeta_k^T]$ is singular. This is a consequence of the fact that $\zeta_k$ contains repeated terms. For example, if the dimensionality $M$ of the innovations vector is 2, then the term $(\tilde{z}_k(1) \tilde{z}_k(2) - E[\tilde{z}_k(1) \tilde{z}_k(2)])$, where $\tilde{z}_k(j)$ is the jth element of $\tilde{z}_k$, appears twice in $\zeta_k$. The number of repeated terms is a function of the dimensionality $M$ of the innovations vector. To avoid this singularity, define the collapsed vector $\zeta_{kc}$ such that

$$\zeta_{kc} = T_M \zeta_k \qquad (3.18)$$

where $T_M$ is a matrix of 1's and 0's designed to eliminate redundant elements from $\zeta_k$. For example if $M = 2$

$$T_M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \qquad (3.19)$$
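The singularity and its removal can be seen in a small numerical experiment. The sketch below draws hypothetical two-dimensional innovations samples (Gaussian, purely for convenience), forms $\zeta_k$ from the sample mean, and compares the rank of the sample covariance before and after applying the first $T_M$ of (3.19).

```python
import numpy as np

# First form of T_M in (3.19): drops the repeated cross term.
T_M = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])

rng = np.random.default_rng(1)
z = rng.normal(size=(5000, 2))                      # hypothetical innovations samples

# Each row is z (x) z = [z1*z1, z1*z2, z2*z1, z2*z2].
zz = np.einsum("ti,tj->tij", z, z).reshape(-1, 4)
zeta = zz - zz.mean(axis=0)                         # zero-mean second order term

cov_full = zeta.T @ zeta / len(zeta)                # sample E[zeta zeta^T], 4 x 4
zeta_c = zeta @ T_M.T                               # collapsed vector, rows are T_M zeta
cov_c = zeta_c.T @ zeta_c / len(zeta_c)             # sample covariance, 3 x 3

rank_full = np.linalg.matrix_rank(cov_full)
rank_c = np.linalg.matrix_rank(cov_c)
print(rank_full, rank_c)
```

The 4 x 4 matrix has rank 3 because its second and third rows coincide, while the collapsed 3 x 3 matrix has full rank and can be inverted.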


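Since $\zeta_{kc}$ does not contain repeated terms, $E[\zeta_{kc} \zeta_{kc}^T]$ is nonsingular. Let $K_{kc}^{(2)}$ denote the collapsed gain associated with replacing $\zeta_k$ with $\zeta_{kc}$ in (3.17). Solving for $K_k^{(1)}$ and $K_{kc}^{(2)}$ yields

$$\begin{aligned}
K_k^{(1)} &= \left( E[\tilde{x}_{k|k-1} \tilde{z}_k^T] - E[\tilde{x}_{k|k-1} \zeta_{kc}^T] E[\zeta_{kc} \zeta_{kc}^T]^{-1} E[\zeta_{kc} \tilde{z}_k^T] \right) \\
&\quad \times \left( E[\tilde{z}_k \tilde{z}_k^T] - E[\tilde{z}_k \zeta_{kc}^T] E[\zeta_{kc} \zeta_{kc}^T]^{-1} E[\zeta_{kc} \tilde{z}_k^T] \right)^{-1} \\
K_{kc}^{(2)} &= \left( E[\tilde{x}_{k|k-1} \zeta_{kc}^T] - E[\tilde{x}_{k|k-1} \tilde{z}_k^T] E[\tilde{z}_k \tilde{z}_k^T]^{-1} E[\tilde{z}_k \zeta_{kc}^T] \right) \\
&\quad \times \left( E[\zeta_{kc} \zeta_{kc}^T] - E[\zeta_{kc} \tilde{z}_k^T] E[\tilde{z}_k \tilde{z}_k^T]^{-1} E[\tilde{z}_k \zeta_{kc}^T] \right)^{-1}.
\end{aligned} \qquad (3.20)$$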


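Equation (3.14) requires that

$$K_k^{(2)} \zeta_k = K_{kc}^{(2)} \zeta_{kc}. \qquad (3.21)$$

Using (3.18) in (3.21), $K_k^{(2)}$ is then obtained from

$$K_k^{(2)} = K_{kc}^{(2)} T_M \qquad (3.22)$$

where

$$E[\tilde{x}_{k|k-1} \tilde{z}_k^T] = P_{k|k-1}^{(2)} H_k^T \qquad (3.23)$$

$$E[\tilde{x}_{k|k-1} \zeta_k^T] = P_{k|k-1}^{(3)T} (H_k^T \otimes H_k^T) \qquad (3.24)$$

$$E[\tilde{z}_k \tilde{z}_k^T] = H_k P_{k|k-1}^{(2)} H_k^T + R_k^{(2)} \qquad (3.25)$$

$$E[\tilde{z}_k \zeta_k^T] = H_k P_{k|k-1}^{(3)T} (H_k^T \otimes H_k^T) + R_k^{(3)T} \qquad (3.26)$$

$$\begin{aligned}
E[\zeta_k \zeta_k^T] &= (H_k \otimes H_k) \left\{ P_{k|k-1}^{(4)} - \mathrm{cst}(P_{k|k-1}^{(2)}) \, \mathrm{cst}(P_{k|k-1}^{(2)})^T \right\} (H_k^T \otimes H_k^T) \\
&\quad + R_k^{(4)} - \mathrm{cst}(R_k^{(2)}) \, \mathrm{cst}(R_k^{(2)})^T \\
&\quad + (H_k \otimes I_m) (P_{k|k-1}^{(2)} \otimes R_k^{(2)}) (H_k^T \otimes I_m) \\
&\quad + (H_k \otimes I_m) E[\tilde{x}_{k|k-1} \otimes v_k \otimes v_k^T \otimes \tilde{x}_{k|k-1}^T] (I_m \otimes H_k^T) \\
&\quad + (I_m \otimes H_k) (R_k^{(2)} \otimes P_{k|k-1}^{(2)}) (I_m \otimes H_k^T) \\
&\quad + (I_m \otimes H_k) E[v_k \otimes \tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1}^T \otimes v_k^T] (H_k^T \otimes I_m).
\end{aligned} \qquad (3.27)$$

It can easily be shown that if all 3rd moments are zero then $K_k^{(2)} = 0$ and $K_k^{(1)}$ reduces to the gain for the standard Kalman filter.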
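In the scalar case (n = m = 1) no collapse is needed and the gain equations reduce to a few lines of arithmetic. The sketch below specializes (3.20) and (3.23) through (3.27) to scalars; the function name and the moment values in the usage lines are hypothetical and chosen only to illustrate the behaviour.

```python
def hof_asym_gains(p2, p3, p4, r2, r3, r4, h=1.0):
    """Scalar gains for x_hat = x_pred + k1 * z + k2 * (z**2 - E[z**2]).

    p2, p3, p4 : 2nd to 4th moments of the prediction error
    r2, r3, r4 : 2nd to 4th moments of the measurement noise
    h          : scalar measurement gain, so that z = h * x_err + v
    """
    # Scalar forms of (3.23)-(3.27).
    e_xz = p2 * h
    e_xzeta = p3 * h ** 2
    e_zz = h * p2 * h + r2
    e_zzeta = h ** 3 * p3 + r3
    e_zetazeta = (h ** 4 * (p4 - p2 ** 2) + (r4 - r2 ** 2)
                  + 4.0 * h ** 2 * p2 * r2)
    # Scalar form of (3.20).
    k1 = (e_xz - e_xzeta * e_zzeta / e_zetazeta) / (e_zz - e_zzeta ** 2 / e_zetazeta)
    k2 = (e_xzeta - e_xz * e_zzeta / e_zz) / (e_zetazeta - e_zzeta ** 2 / e_zz)
    return k1, k2


# Zero third moments (Gaussian-like): k2 vanishes and k1 is the Kalman gain.
k1_g, k2_g = hof_asym_gains(1.0, 0.0, 3.0, 1.0, 0.0, 3.0)

# Skewed measurement noise (hypothetical moments): k2 becomes nonzero.
k1_s, k2_s = hof_asym_gains(1.0, 0.0, 3.0, 1.0, 1.0, 5.0)
var_hof = 1.0 - k1_s * 1.0 - k2_s * 0.0     # p2 - k1*E[x z] - k2*E[x zeta]
var_kalman = 1.0 - 0.5                      # Kalman posterior variance
```

With skewed noise the quadratic term lets the filter reduce the posterior variance below the Kalman value, which is the motivation for retaining $K_k^{(2)}$.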

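Using the state model (3.1) the prediction equation becomes

$$\hat{x}_{k|k-1} = \Phi_{k-1} \hat{x}_{k-1|k-1}. \qquad (3.28)$$

The corresponding prediction error from (3.6) is given by

$$\tilde{x}_{k|k-1} = \Phi_{k-1} \tilde{x}_{k-1|k-1} + w_{k-1}. \qquad (3.29)$$

The prediction moments are then evaluated as

$$P_{k|k-1}^{(2)} = \Phi_{k-1} P_{k-1|k-1}^{(2)} \Phi_{k-1}^T + Q_{k-1}^{(2)} \qquad (3.30)$$

$$P_{k|k-1}^{(3)} = (\Phi_{k-1} \otimes \Phi_{k-1}) P_{k-1|k-1}^{(3)} \Phi_{k-1}^T + Q_{k-1}^{(3)} \qquad (3.31)$$

$$\begin{aligned}
P_{k|k-1}^{(4)} &= E[(\tilde{x}_{k|k-1}^{\otimes 2}) (\tilde{x}_{k|k-1}^{\otimes 2})^T] \\
&= (\Phi_{k-1} \otimes \Phi_{k-1}) P_{k-1|k-1}^{(4)} (\Phi_{k-1}^T \otimes \Phi_{k-1}^T) + Q_{k-1}^{(4)} \\
&\quad + (\Phi_{k-1} \otimes I_n) (P_{k-1|k-1}^{(2)} \otimes Q_{k-1}^{(2)}) (\Phi_{k-1}^T \otimes I_n) \\
&\quad + (\Phi_{k-1} \otimes \Phi_{k-1}) \, \mathrm{cst}(P_{k-1|k-1}^{(2)}) \, \mathrm{cst}(Q_{k-1}^{(2)})^T \\
&\quad + (\Phi_{k-1} \otimes I_n) E[\tilde{x}_{k-1|k-1} \otimes w_{k-1} \otimes w_{k-1}^T \otimes \tilde{x}_{k-1|k-1}^T] (I_n \otimes \Phi_{k-1}^T) \\
&\quad + (I_n \otimes \Phi_{k-1}) E[w_{k-1} \otimes \tilde{x}_{k-1|k-1} \otimes \tilde{x}_{k-1|k-1}^T \otimes w_{k-1}^T] (\Phi_{k-1}^T \otimes I_n) \\
&\quad + \mathrm{cst}(Q_{k-1}^{(2)}) \, \mathrm{cst}(P_{k-1|k-1}^{(2)})^T (\Phi_{k-1}^T \otimes \Phi_{k-1}^T) \\
&\quad + (I_n \otimes \Phi_{k-1}) (Q_{k-1}^{(2)} \otimes P_{k-1|k-1}^{(2)}) (I_n \otimes \Phi_{k-1}^T)
\end{aligned} \qquad (3.32)$$

where $I_n$ is an n-dimensional identity matrix. Similarly, the moments of the filter error can be evaluated using equation (3.12). Let

$$A_k \triangleq (I_n - K_k^{(1)} H_k). \qquad (3.33)$$
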


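The filter variance becomes

$$\begin{aligned}
P_{k|k}^{(2)} &= E[\tilde{x}_{k|k} \tilde{x}_{k|k}^T] \\
&= A_k P_{k|k-1}^{(2)} A_k^T \\
&\quad - K_k^{(2)} (H_k \otimes H_k) P_{k|k-1}^{(3)} A_k^T - A_k P_{k|k-1}^{(3)T} (H_k^T \otimes H_k^T) K_k^{(2)T} \\
&\quad + K_k^{(1)} R_k^{(2)} K_k^{(1)T} + K_k^{(1)} R_k^{(3)T} K_k^{(2)T} + K_k^{(2)} R_k^{(3)} K_k^{(1)T} \\
&\quad + K_k^{(2)} (H_k \otimes H_k) \left\{ P_{k|k-1}^{(4)} - \mathrm{cst}(P_{k|k-1}^{(2)}) \, \mathrm{cst}(P_{k|k-1}^{(2)})^T \right\} (H_k^T \otimes H_k^T) K_k^{(2)T} \\
&\quad + K_k^{(2)} \left\{ R_k^{(4)} - \mathrm{cst}(R_k^{(2)}) \, \mathrm{cst}(R_k^{(2)})^T \right\} K_k^{(2)T} \\
&\quad + K_k^{(2)} (H_k \otimes I_m) (P_{k|k-1}^{(2)} \otimes R_k^{(2)}) (H_k^T \otimes I_m) K_k^{(2)T} \\
&\quad + K_k^{(2)} (H_k \otimes I_m) E[\tilde{x}_{k|k-1} \otimes v_k \otimes v_k^T \otimes \tilde{x}_{k|k-1}^T] (I_m \otimes H_k^T) K_k^{(2)T} \\
&\quad + K_k^{(2)} (I_m \otimes H_k) (R_k^{(2)} \otimes P_{k|k-1}^{(2)}) (I_m \otimes H_k^T) K_k^{(2)T} \\
&\quad + K_k^{(2)} (I_m \otimes H_k) E[v_k \otimes \tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1}^T \otimes v_k^T] (H_k^T \otimes I_m) K_k^{(2)T}.
\end{aligned} \qquad (3.34)$$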

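It is observed that the equation for the nth filter error moment requires the availability of prediction and measurement error moments of order 2n. This is a consequence of the fact that the filter error given by (3.12) is a second order function of the innovations. Since only the prediction moments up to 4th order are propagated, the equations for the 3rd and 4th order filter moments are truncated so that they contain only 3rd and 4th order functions of the prediction and measurement error moments. An alternative would be to completely expand the 3rd and 4th order filter moments in terms of all 2n prediction and measurement error moments and approximate the higher moments using suitable functions of the 2nd through 4th moments. The vector expansion becomes very unwieldy and is not included here. However, Section 3.5 contains the scalar expansions. Simulation experiments presented in Section 3.7 compare the truncated models to the nontruncated models which use higher order moment approximation.

With this restriction the 3rd order filter moment becomes

$$\begin{aligned}
P_{k|k}^{(3)} &= (A_k \otimes A_k) P_{k|k-1}^{(3)} A_k^T - (K_k^{(1)} \otimes K_k^{(1)}) R_k^{(3)} K_k^{(1)T} \\
&\quad - (A_k \otimes K_k^{(2)} (H_k \otimes H_k)) \left\{ \mathrm{scst}(P_{k|k-1}^{(4)}) - P_{k|k-1}^{(2)} \otimes \mathrm{cst}(P_{k|k-1}^{(2)}) \right\} A_k^T \\
&\quad - (A_k \otimes A_k) \left\{ P_{k|k-1}^{(4)} - \mathrm{cst}(P_{k|k-1}^{(2)}) \, \mathrm{cst}(P_{k|k-1}^{(2)})^T \right\} (H_k^T \otimes H_k^T) K_k^{(2)T} \\
&\quad - (K_k^{(2)} (H_k \otimes H_k) \otimes A_k) \left\{ \mathrm{scst}(P_{k|k-1}^{(4)}) - \mathrm{cst}(P_{k|k-1}^{(2)}) \otimes P_{k|k-1}^{(2)} \right\} A_k^T \\
&\quad - (K_k^{(1)} \otimes K_k^{(2)}) \left\{ \mathrm{scst}(R_k^{(4)}) - R_k^{(2)} \otimes \mathrm{cst}(R_k^{(2)}) \right\} K_k^{(1)T} \\
&\quad - (K_k^{(1)} \otimes K_k^{(1)}) \left\{ R_k^{(4)} - \mathrm{cst}(R_k^{(2)}) \, \mathrm{cst}(R_k^{(2)})^T \right\} K_k^{(2)T} \\
&\quad - (K_k^{(2)} \otimes K_k^{(1)}) \left\{ \mathrm{scst}(R_k^{(4)}) - \mathrm{cst}(R_k^{(2)}) \otimes R_k^{(2)} \right\} K_k^{(1)T} \\
&\quad + (A_k \otimes K_k^{(2)} (H_k \otimes I_m)) \left( \mathrm{cst}(P_{k|k-1}^{(2)}) \otimes R_k^{(2)} \right) K_k^{(1)T} \\
&\quad + (A_k \otimes K_k^{(2)} (I_m \otimes H_k)) E[\tilde{x}_{k|k-1} \otimes v_k \otimes \tilde{x}_{k|k-1} \otimes v_k^T] K_k^{(1)T} \\
&\quad + (A_k \otimes K_k^{(1)}) (P_{k|k-1}^{(2)} \otimes R_k^{(2)}) (H_k^T \otimes I_m) K_k^{(2)T} \\
&\quad + (A_k \otimes K_k^{(1)}) E[\tilde{x}_{k|k-1} \otimes v_k \otimes v_k^T \otimes \tilde{x}_{k|k-1}^T] (I_m \otimes H_k^T) K_k^{(2)T} \\
&\quad + (K_k^{(1)} \otimes K_k^{(2)} (H_k \otimes I_m)) E[v_k \otimes \tilde{x}_{k|k-1} \otimes v_k \otimes \tilde{x}_{k|k-1}^T] A_k^T \\
&\quad + (K_k^{(1)} \otimes K_k^{(2)} (I_m \otimes H_k)) \left( \mathrm{cst}(R_k^{(2)}) \otimes P_{k|k-1}^{(2)} \right) A_k^T \\
&\quad + (K_k^{(1)} \otimes A_k) E[v_k \otimes \tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1}^T \otimes v_k^T] (H_k^T \otimes I_m) K_k^{(2)T} \\
&\quad + (K_k^{(1)} \otimes A_k) (R_k^{(2)} \otimes P_{k|k-1}^{(2)}) (I_m \otimes H_k^T) K_k^{(2)T} \\
&\quad + (K_k^{(2)} (H_k \otimes I_m) \otimes K_k^{(1)}) \left( P_{k|k-1}^{(2)} \otimes \mathrm{cst}(R_k^{(2)}) \right) A_k^T \\
&\quad + (K_k^{(2)} (I_m \otimes H_k) \otimes K_k^{(1)}) E[v_k \otimes \tilde{x}_{k|k-1} \otimes v_k \otimes \tilde{x}_{k|k-1}^T] A_k^T \\
&\quad + (K_k^{(2)} (H_k \otimes I_m) \otimes A_k) E[\tilde{x}_{k|k-1} \otimes v_k \otimes \tilde{x}_{k|k-1} \otimes v_k^T] K_k^{(1)T} \\
&\quad + (K_k^{(2)} (I_m \otimes H_k) \otimes A_k) \left( R_k^{(2)} \otimes \mathrm{cst}(P_{k|k-1}^{(2)}) \right) K_k^{(1)T}.
\end{aligned} \qquad (3.35)$$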


The fourth order moment expansion is truncated so that it includes only
functions of 4th order prediction and measurement error moments.

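$$
\begin{aligned}
P^{(4)}_{k|k} &= E[(\tilde{x}_{k|k} \otimes \tilde{x}_{k|k})(\tilde{x}_{k|k} \otimes \tilde{x}_{k|k})^T] \\
&= (A_k \otimes A_k)\, P^{(4)}_{k|k-1}\, (A_k^T \otimes A_k^T) \\
&\quad + (A_k \otimes K_k^{(1)})\, (P^{(2)}_{k|k-1} \otimes R_k^{(2)})\, (A_k^T \otimes K_k^{(1)T}) \\
&\quad + (A_k \otimes A_k)\, \mathrm{cst}(P^{(2)}_{k|k-1})\, \mathrm{cst}(R_k^{(2)})^T (K_k^{(1)T} \otimes K_k^{(1)T}) \\
&\quad + (A_k \otimes K_k^{(1)})\, E[\tilde{x}_{k|k-1} \otimes v_k^T \otimes v_k \otimes \tilde{x}_{k|k-1}^T]\, (K_k^{(1)T} \otimes A_k^T) \\
&\quad + (K_k^{(1)} \otimes A_k)\, E[v_k \otimes \tilde{x}_{k|k-1}^T \otimes \tilde{x}_{k|k-1} \otimes v_k^T]\, (A_k^T \otimes K_k^{(1)T}) \\
&\quad + (K_k^{(1)} \otimes K_k^{(1)})\, \mathrm{cst}(R_k^{(2)})\, \mathrm{cst}(P^{(2)}_{k|k-1})^T (A_k^T \otimes A_k^T) \\
&\quad + (K_k^{(1)} \otimes A_k)\, (R_k^{(2)} \otimes P^{(2)}_{k|k-1})\, (K_k^{(1)T} \otimes A_k^T) \\
&\quad + (K_k^{(1)} \otimes K_k^{(1)})\, R_k^{(4)}\, (K_k^{(1)T} \otimes K_k^{(1)T})
\end{aligned}
\tag{3.36}
$$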

The  filter  equations  (3.14,  3.34-3.36),  gain  equations  (3.20),  and  prediction 
equations  (3.28,  3.30-3.32)  constitute  the  discrete  filter  relations  for  non-Gaussian 
noise  with  arbitrary  asymmetrical  distributions.  These  relations  are  suboptimal  in 
that  they  do  not  completely  characterize  the  noise  distributions  since  they  use  only 
the  first  four  moments  of  the  distributions. 

3.3  Non-Gaussian  Filtering  for  Symmetrical  Distributions 

The derivation for the non-Gaussian filter for symmetrical distributions follows
the same general procedure as in the previous section. If the errors are assumed
to have only even moments, then it can be shown that $K_k^{(i)} = 0$ for $i = 0, 2, 4, \ldots$
[5]. The truncated non-Gaussian filter for symmetrical distributions is obtained by
letting I = 3 in equation (3.9), which gives the filter error

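$$
\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - K_k^{(1)} \tilde{z}_k - K_k^{(3)} (\tilde{z}_k \otimes \tilde{z}_k \otimes \tilde{z}_k)
\tag{3.37}
$$

with corresponding filter equation

$$
\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k^{(1)} \tilde{z}_k + K_k^{(3)} (\tilde{z}_k \otimes \tilde{z}_k \otimes \tilde{z}_k)
\tag{3.38}
$$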

The estimator is required to be unbiased. By definition all odd moments of the
innovations are zero. Since $E[\tilde{x}_{k|k-1}] = 0$, the expected value of the estimation
error given in (3.37) is zero. For notational convenience let

$$
\alpha_k = \tilde{z}_k \otimes \tilde{z}_k \otimes \tilde{z}_k
\tag{3.39}
$$

with $E[\alpha_k] = 0$.


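The formulas for the gains $K_k^{(1)}$ and $K_k^{(3)}$ result from the requirement for
a minimum variance solution. The variance of the a posteriori density function is
given by

$$
\begin{aligned}
P^{(2)}_{k|k} &= E[\tilde{x}_{k|k} \tilde{x}_{k|k}^T] \\
&= E[\tilde{x}_{k|k-1} \tilde{x}_{k|k-1}^T] - E[\tilde{x}_{k|k-1} \tilde{z}_k^T] K_k^{(1)T} - E[\tilde{x}_{k|k-1} \alpha_k^T] K_k^{(3)T} \\
&\quad - K_k^{(1)} E[\tilde{z}_k \tilde{x}_{k|k-1}^T] + K_k^{(1)} E[\tilde{z}_k \tilde{z}_k^T] K_k^{(1)T} + K_k^{(1)} E[\tilde{z}_k \alpha_k^T] K_k^{(3)T} \\
&\quad - K_k^{(3)} E[\alpha_k \tilde{x}_{k|k-1}^T] + K_k^{(3)} E[\alpha_k \tilde{z}_k^T] K_k^{(1)T} + K_k^{(3)} E[\alpha_k \alpha_k^T] K_k^{(3)T}.
\end{aligned}
\tag{3.40}
$$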

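From the matrix minimum principle [23]

$$
\frac{\partial\, \mathrm{trace}\{P^{(2)}_{k|k}\}}{\partial K_k^{(1)}} = 0, \qquad
\frac{\partial\, \mathrm{trace}\{P^{(2)}_{k|k}\}}{\partial K_k^{(3)}} = 0.
\tag{3.41}
$$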


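Carrying out these operations on (3.40) yields

$$
\begin{aligned}
K_k^{(1)} &= \left(E[\tilde{x}_{k|k-1} \tilde{z}_k^T] - K_k^{(3)} E[\alpha_k \tilde{z}_k^T]\right) E[\tilde{z}_k \tilde{z}_k^T]^{-1} \\
K_k^{(3)} &= \left(E[\tilde{x}_{k|k-1} \alpha_k^T] - K_k^{(1)} E[\tilde{z}_k \alpha_k^T]\right) E[\alpha_k \alpha_k^T]^{-1}.
\end{aligned}
\tag{3.42}
$$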
As with the corresponding matrix for the asymmetric filter, it is observed that
$E[\alpha_k \alpha_k^T]$ is singular. A collapsed vector $\alpha_{kc}$ is defined such that

$$
\alpha_{kc} = U_m \alpha_k
\tag{3.43}
$$

where $U_m$ is a matrix of 1's and 0's designed to extract only one of each term from
$\alpha_k$. Since $\alpha_{kc}$ does not contain repeated terms, $E[\alpha_{kc} \alpha_{kc}^T]$ is nonsingular. Let $K_{kc}^{(3)}$
denote the collapsed gain associated with replacing $\alpha_k$ with $\alpha_{kc}$ in (3.42). Solving
for $K_k^{(1)}$ and $K_{kc}^{(3)}$ yields

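$$
\begin{aligned}
K_k^{(1)} &= \left(E[\tilde{x}_{k|k-1} \tilde{z}_k^T] - E[\tilde{x}_{k|k-1} \alpha_{kc}^T]\, E[\alpha_{kc} \alpha_{kc}^T]^{-1} E[\alpha_{kc} \tilde{z}_k^T]\right) \\
&\quad \times \left(E[\tilde{z}_k \tilde{z}_k^T] - E[\tilde{z}_k \alpha_{kc}^T]\, E[\alpha_{kc} \alpha_{kc}^T]^{-1} E[\alpha_{kc} \tilde{z}_k^T]\right)^{-1} \\
K_{kc}^{(3)} &= \left(E[\tilde{x}_{k|k-1} \alpha_{kc}^T] - E[\tilde{x}_{k|k-1} \tilde{z}_k^T]\, E[\tilde{z}_k \tilde{z}_k^T]^{-1} E[\tilde{z}_k \alpha_{kc}^T]\right) \\
&\quad \times \left(E[\alpha_{kc} \alpha_{kc}^T] - E[\alpha_{kc} \tilde{z}_k^T]\, E[\tilde{z}_k \tilde{z}_k^T]^{-1} E[\tilde{z}_k \alpha_{kc}^T]\right)^{-1}.
\end{aligned}
\tag{3.44}
$$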

Equation (3.38) requires that

$$
K_k^{(3)} \alpha_k = K_{kc}^{(3)} \alpha_{kc}
\tag{3.45}
$$

Using (3.43) in (3.45), $K_k^{(3)}$ is then obtained from

$$
K_k^{(3)} = K_{kc}^{(3)} U_m.
\tag{3.46}
$$
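The collapsing step in (3.43) through (3.46) can be illustrated with a small numerical sketch. The construction below is an assumption made for illustration: it builds a 0/1 selection matrix that keeps the first occurrence of each distinct third-order monomial in the Kronecker cube of an m-vector. The thesis may order or select the retained terms differently, and the names `kron_cube`, `collapse_matrix`, and `matmul` are hypothetical helpers.

```python
from itertools import product

def kron_cube(z):
    """Return z (x) z (x) z as a flat list in Kronecker-product order."""
    return [a * b * c for a, b, c in product(z, repeat=3)]

def collapse_matrix(m):
    """0/1 matrix keeping the first occurrence of each distinct monomial
    z_i z_j z_l in the Kronecker cube of an m-vector (an assumed choice)."""
    seen, rows = set(), []
    for pos, idx in enumerate(product(range(m), repeat=3)):
        key = tuple(sorted(idx))          # z_i z_j z_l is order independent
        if key not in seen:
            seen.add(key)
            row = [0] * m**3
            row[pos] = 1
            rows.append(row)
    return rows

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

z = [2.0, 3.0]                            # example innovation, m = 2
alpha = kron_cube(z)                      # 8 entries, with repeats
U = collapse_matrix(2)                    # 4 x 8 selection matrix
alpha_c = [row[0] for row in matmul(U, [[a] for a in alpha])]

K3c = [[1.0, -2.0, 0.5, 3.0]]             # arbitrary collapsed gain (1 x 4)
K3 = matmul(K3c, U)                       # full gain, as in (3.46)
lhs = sum(k * a for k, a in zip(K3[0], alpha))
rhs = sum(k * a for k, a in zip(K3c[0], alpha_c))
print(alpha_c, lhs, rhs)
```

For m = 2 the eight entries of the Kronecker cube collapse to four distinct monomials, and the full gain recovered from the collapsed gain reproduces the same correction term, which is the requirement stated in (3.45).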
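Define the parameters

$$
\begin{aligned}
B_{k1} &= H_k \otimes H_k \otimes H_k \\
B_{k2} &= H_k \otimes H_k \otimes I_m \\
B_{k3} &= H_k \otimes I_m \otimes H_k \\
B_{k4} &= H_k \otimes I_m \otimes I_m \\
B_{k5} &= I_m \otimes H_k \otimes H_k \\
B_{k6} &= I_m \otimes H_k \otimes I_m \\
B_{k7} &= I_m \otimes I_m \otimes H_k \\
B_{k8} &= I_m \otimes I_m \otimes I_m
\end{aligned}
\tag{3.47}
$$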

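The expectations given in equation (3.44) become

$$
E[\tilde{x}_{k|k-1} \tilde{z}_k^T] = P^{(2)}_{k|k-1} H_k^T
\tag{3.48}
$$

$$
\begin{aligned}
E[\tilde{x}_{k|k-1} \alpha_k^T] &= \mathrm{scst}(P^{(4)}_{k|k-1})^T B_{k1}^T + \left(P^{(2)}_{k|k-1} \otimes \mathrm{cst}(R_k^{(2)})^T\right) B_{k4}^T \\
&\quad + E[v_k^T \otimes \tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1}^T \otimes v_k^T]\, B_{k6}^T + \left(\mathrm{cst}(R_k^{(2)})^T \otimes P^{(2)}_{k|k-1}\right) B_{k7}^T
\end{aligned}
\tag{3.49}
$$

$$
E[\tilde{z}_k \tilde{z}_k^T] = H_k P^{(2)}_{k|k-1} H_k^T + R_k^{(2)}
\tag{3.50}
$$

$$
\begin{aligned}
E[\tilde{z}_k \alpha_k^T] &= E[\alpha_k \tilde{z}_k^T]^T \\
&= H_k \Big\{ \mathrm{scst}(P^{(4)}_{k|k-1})^T B_{k1}^T + \left(P^{(2)}_{k|k-1} \otimes \mathrm{cst}(R_k^{(2)})^T\right) B_{k4}^T \\
&\qquad + E[v_k^T \otimes \tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1}^T \otimes v_k^T]\, B_{k6}^T + \left(\mathrm{cst}(R_k^{(2)})^T \otimes P^{(2)}_{k|k-1}\right) B_{k7}^T \Big\} \\
&\quad + \left(\mathrm{cst}(P^{(2)}_{k|k-1})^T \otimes R_k^{(2)}\right) B_{k2}^T + E[\tilde{x}_{k|k-1}^T \otimes v_k \otimes v_k^T \otimes \tilde{x}_{k|k-1}^T]\, B_{k3}^T \\
&\quad + \left(R_k^{(2)} \otimes \mathrm{cst}(P^{(2)}_{k|k-1})^T\right) B_{k5}^T + \mathrm{scst}(R_k^{(4)})^T B_{k8}^T
\end{aligned}
\tag{3.51}
$$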
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JSfaaZl  =  £*,<iLX  +  ®  cs^W, 

+  Bl6C„  +  Bhat(R^)  ®  scst(P«-,  W, 

+  A2<L  ®  MX  +  BhCi2Bl 

+  BhCt3Bl2  +  ®  scst(44))TB2; 

+  B^X  +  Bt3Ci,sBl3  +  BtsCieB?3  +  Bt3Ct3Bl3 
+  B*jSC8t(Pj[j2_j)  ®  cstX’fBf,  +  ®  *X 

+  Bfc6C*gBjfc4  +  BbjCkgB^  +  Bt2CtjgBji5  +  Bt3Ck33Bh3 

+  Bt542)  ®  P$_X5  +  B^URi'1)  ®  cSt(B<ji).1fBj; 

+  Bfc]  +  Bk3Ck33B33  +  BtgCtj^Bjg  +  B^jCt^B^ 

+  Btlcst(MV  ®  scst(P$_,)B2;  +  B*,C4lX 
+ b*sc*X + b*X'  ®  MIX 

+  B^cstlP®.,)  ®  scst(M‘>X  +  BhCtl3Bl3 
+  Bl3*a,t(RWf  ®  cstfpW.X,  +  Bt,4X 


(3.52) 


The parameters $C_{ki}$ are 6th order functions of expectations of the measurement
and prediction errors. They are defined by

Ckl  =  E[vk  ®  5cjt|fc_1  0  xf|fc-i  0  xjjfc.j  0  xJjjt.j  0  vt] 

Ck2  =  E[X fcjfc-!  <0  Vfc  0  Xfcjjt.!  ®  xj jfc_j  ®  xf|t_l  ®  vf] 

Ck3  =  £[v*  ®  Xk\k-1  ®  Xjb|t-1  ®  XjEjjfc.j  ®  xf|t_i  ®  vfl 

67*4  =  ^[xjtji-1  ®  Xtjjt.j  ®  vf  ®  xf|Jt_,  ®  vf  ®  xfiJt.j] 
ch  =  £[xjt|*_i  ®  V*  ®  Xfcii.j  ®  xfn_!  ®  vf  ®  xf|t_i] 

Ck6  =  £lV*  ®  ®  ®  xf|t_j  ®  vf  ®  xjj  i_j] 

Ck7  =  £[v*  ®  Vfc  ®  Vi  0  xjjjt,!  ®  vf  ®  xjEjt-i] 

Ct8  =  £lv*  ®  ®  v*  ®  *f|*-l  ®  vf  ®  vf] 

Ctg  =  E[\ k  ®vk®  xt| k-1  ®  xf|t_!  ®  vf  ®  vf  ] 

(3-53) 

CkW  =  ^(x»|*-l  ®  ®  Vfc  ®  vf  ®  xf|fc_j  ®  xfn_j] 

Ctn  =  £[xt|t-l  ®  V*  ®  Xt|fc-i  ®  vf  ®  xf|fc_!  ®  xflfc-i] 

C*12  =  £[x*|*-l  ®  ®  *k\k-l  ®  vf  ®  ®  vf ] 

Cjfc13  =  £[Xi|fc-l  ®  V*  ®  Vfc  ®  vf  ®  xf||._1  ®  vf  ] 
ck  14  =  £[vt  0  Xjfclfc.!  0  Vt  ®  vf  ®  xf|t_!  0  vf  ] 

<?t15  =  ^(V*  ®  V*  ®  *t|i-l  ®  vf  0  xf|t_!  ®  vf] 

Cku  =  E[X t|t_i  0  vt  0  vt  ®  vf  0  vf  ®  xfjt-i] 

Ck„  -  E[\k  0  Xt|t_i  0  vt  0  vf  0  vf  ®  xf|t_j] 

Ckn  =  £[xfc|t-l  0  Vt  0  Xt|t_!  0  vf  ®  vf  0  vf  ] 


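The remaining work involves evaluation of the prediction error moments and
the filter error moments. The prediction error, $\tilde{x}_{k|k-1}$, is given by (3.29). The 2nd and
4th prediction moments are generated from the prediction errors. These moments
are expressed as

$$
P^{(2)}_{k|k-1} = \Phi_{k-1} P^{(2)}_{k-1|k-1} \Phi_{k-1}^T + Q^{(2)}_{k-1}
\tag{3.54}
$$

$$
\begin{aligned}
P^{(4)}_{k|k-1} &= E[(\tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1})(\tilde{x}_{k|k-1} \otimes \tilde{x}_{k|k-1})^T] \\
&= (\Phi_{k-1} \otimes \Phi_{k-1})\, P^{(4)}_{k-1|k-1}\, (\Phi_{k-1}^T \otimes \Phi_{k-1}^T) + Q^{(4)}_{k-1} \\
&\quad + (\Phi_{k-1} \otimes I_n)\, (P^{(2)}_{k-1|k-1} \otimes Q^{(2)}_{k-1})\, (\Phi_{k-1}^T \otimes I_n) \\
&\quad + (\Phi_{k-1} \otimes \Phi_{k-1})\, \mathrm{cst}(P^{(2)}_{k-1|k-1})\, \mathrm{cst}(Q^{(2)}_{k-1})^T \\
&\quad + (\Phi_{k-1} \otimes I_n)\, E[\tilde{x}_{k-1|k-1} \otimes w_{k-1}^T \otimes w_{k-1} \otimes \tilde{x}_{k-1|k-1}^T]\, (I_n \otimes \Phi_{k-1}^T) \\
&\quad + (I_n \otimes \Phi_{k-1})\, E[w_{k-1} \otimes \tilde{x}_{k-1|k-1}^T \otimes \tilde{x}_{k-1|k-1} \otimes w_{k-1}^T]\, (\Phi_{k-1}^T \otimes I_n) \\
&\quad + \mathrm{cst}(Q^{(2)}_{k-1})\, \mathrm{cst}(P^{(2)}_{k-1|k-1})^T (\Phi_{k-1}^T \otimes \Phi_{k-1}^T) \\
&\quad + (I_n \otimes \Phi_{k-1})\, (Q^{(2)}_{k-1} \otimes P^{(2)}_{k-1|k-1})\, (I_n \otimes \Phi_{k-1}^T)
\end{aligned}
\tag{3.55}
$$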


Similarly, using equation (3.37) the moments of the filter error can be evaluated.
Let

$$
A_k = (I_n - K_k^{(1)} H_k)
\tag{3.56}
$$
Then the filter variance becomes

+  4,,'WT 

-  AtvxUPft^f  Biy?  T  -  *f)B4jSat(P$_1Mj|' 

~  a*P$L,  ®  «t(4 2)fBjtKi3)  T  -  ^3>Bi4cst(jei2))  ®  P^_tA[ 

-  AtE[x[lt_,  ®  Vi  ®  vj  ®  t 

-  Kk ]BksE[xnt.,  ®  vt  ®  vl  ®  xti,.,]^ 

-  Ata,t(R^  f  ®  pIIUb^K^  t  -  Kl3)Bk,P$_,  ®  cst(Rl3,)A[ 

+  ff|,)cBt(J$j-1)T  ®  R{k]BThK{3)  T  +  K{3)BhR^  ®  csttP®,,)^1' T 

+  K^ElvI  ®  ®  x^_,  ®  wl)Bl3K^  t 

+  ®  xtn_x  ®  xjjj_i  ®  Vil/fW  t 

+  K^R™  ®  cst  (P$.i)T  T  +  ®  R[2)R{t)T 

+  K^scstfR^fR^  T  +  ^JW(44>)jri»  T  (3  57) 

Equation (3.37) dictates that the filter variance should include 6th order functions
of the prediction and measurement error moments. These higher order terms
are not included in the filter variance expression, just as all 5th and higher order
terms are disregarded in the development of the asymmetrical filter. By doing so
it is implicitly assumed that the contributions from these higher order terms are
negligible. As noted previously, if these terms were included in the derivation of the
filter moments it would necessitate some approximation procedure for these high
order prediction and measurement moments, since only 2nd and 4th order prediction
moments are propagated. Similarly the 4th order filter moment requires the availability
of 6th, 8th, 10th, and 12th order prediction and measurement error moments.
Again the 4th order expansion is truncated to include only 4th order functions of the
prediction and measurement errors. The resulting moment equation becomes

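$$
\begin{aligned}
P^{(4)}_{k|k} &= (A_k \otimes A_k)\, P^{(4)}_{k|k-1}\, (A_k^T \otimes A_k^T) \\
&\quad + (A_k \otimes K_k^{(1)})\, (P^{(2)}_{k|k-1} \otimes R_k^{(2)})\, (A_k^T \otimes K_k^{(1)T}) \\
&\quad + (A_k \otimes A_k)\, \mathrm{cst}(P^{(2)}_{k|k-1})\, \mathrm{cst}(R_k^{(2)})^T (K_k^{(1)T} \otimes K_k^{(1)T}) \\
&\quad + (A_k \otimes K_k^{(1)})\, E[\tilde{x}_{k|k-1} \otimes v_k^T \otimes v_k \otimes \tilde{x}_{k|k-1}^T]\, (K_k^{(1)T} \otimes A_k^T) \\
&\quad + (K_k^{(1)} \otimes A_k)\, E[v_k \otimes \tilde{x}_{k|k-1}^T \otimes \tilde{x}_{k|k-1} \otimes v_k^T]\, (A_k^T \otimes K_k^{(1)T}) \\
&\quad + (K_k^{(1)} \otimes K_k^{(1)})\, \mathrm{cst}(R_k^{(2)})\, \mathrm{cst}(P^{(2)}_{k|k-1})^T (A_k^T \otimes A_k^T) \\
&\quad + (K_k^{(1)} \otimes A_k)\, (R_k^{(2)} \otimes P^{(2)}_{k|k-1})\, (K_k^{(1)T} \otimes A_k^T) \\
&\quad + (K_k^{(1)} \otimes K_k^{(1)})\, R_k^{(4)}\, (K_k^{(1)T} \otimes K_k^{(1)T})
\end{aligned}
\tag{3.58}
$$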

The  filter  equations  (3.38,  3.57,  3.58),  gain  equations  (3.44),  and  prediction 
equations  (3.28,  3.54,  3.55)  constitute  the  discrete  filter  relations  for  non-Gaussian 
noise  with  arbitrary  symmetrical  distributions.  Similar  to  the  asymmetrical  filter, 
these  relations  are  suboptimal  in  that  they  do  not  completely  characterize  the  noise 
distributions  since  they  make  use  of  only  the  first  four  moments  of  the  distributions. 

3.4  Nonlinear  Non-Gaussian  Filtering 

It is straightforward to extend the high order filters that have been derived for
linear plant and measurement models to nonlinear plant and nonlinear measurement
equations in non-Gaussian noise. Using linearized models based on 1st order Taylor
series expansions, replace $\Phi_{k-1}$ and $H_k$ in the linear model non-Gaussian filters with

$$
F_k = \left. \frac{\partial f_k(x_k)}{\partial x_k} \right|_{x_k = \hat{x}_{k-1|k-1}}, \qquad
H_k = \left. \frac{\partial h_k(x_k)}{\partial x_k} \right|_{x_k = \hat{x}_{k|k-1}}
$$

where $f_k(x_k)$ and $h_k(x_k)$ are the system and measurement nonlinearities. The equations
for the nonlinear non-Gaussian filter have the same form as those for the linear
non-Gaussian filter. However, the nonlinear filter requires the computation of $F_k$ and
$H_k$ at every iteration.
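As a minimal sketch of the linearization step, the Jacobians can be approximated by central differences when analytic derivatives are inconvenient. The functions `f_example` and `h_example` below are hypothetical nonlinearities chosen only for illustration, and `numerical_jacobian` is an assumed helper, not part of the filter derivation.

```python
import math

def numerical_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian of func at the point x (lists in, lists out)."""
    fx = func(x)
    jac = [[0.0] * len(x) for _ in range(len(fx))]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = func(x_plus), func(x_minus)
        for i in range(len(fx)):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac

def f_example(x):
    """Hypothetical plant nonlinearity (illustration only)."""
    return [x[0] + 0.1 * math.sin(x[1]), 0.9 * x[1]]

def h_example(x):
    """Hypothetical measurement nonlinearity (illustration only)."""
    return [x[0] ** 2]

x_filt_prev = [0.0, 0.0]                 # stands in for the previous filtered estimate
F_k = numerical_jacobian(f_example, x_filt_prev)
x_pred = f_example(x_filt_prev)          # predicted estimate
x_pred = [2.0, 0.0]                      # overridden here to give a nontrivial H_k
H_k = numerical_jacobian(h_example, x_pred)
print(F_k, H_k)
```

The resulting `F_k` and `H_k` would then take the place of the state transition and measurement matrices in the linear filter relations, and must be recomputed at each step because the evaluation points change.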

Likewise,  the  locally  iterated  Kalman  filter,  which  is  discussed  in  Section 
2.4.3,  can  be  obtained  from  the  non-Gaussian  filter  equations.  The  locally  iterated 
non-Gaussian  filter  requires  the  computation  of  all  the  filtered  estimates,  all  the 
gains,  and  all  the  moments  of  the  a  posteriori  density  function  on  every  iteration  in 
each  step  in  the  filtering  process. 

3.5  Non-Gaussian  Filters  for  Scalar  Models 

In  this  section  the  scalar  equations  for  the  symmetrical  and  asymmetrical 
filters  are  presented.  It  is  shown  that  these  filters  reduce  to  the  Kalman  filter 
equations  for  the  special  case  of  Gaussian  noise.  In  contrast  to  the  development  of 
the  vector-based  filter  equations  in  which  the  equations  are  truncated  in  order  to 
reduce  the  filter  complexity,  the  filter  equations  for  the  scalar  models  are  derived 
without  truncation. 

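The scalar state and measurement equations are given by

$$
x_k = \phi_{k-1} x_{k-1} + w_{k-1}, \qquad z_k = h_k x_k + v_k
\tag{3.59}
$$

The predicted estimate is $\hat{x}_{k|k-1} = \phi_{k-1} \hat{x}_{k-1|k-1}$ and the prediction error
$\tilde{x}_{k|k-1} = x_k - \hat{x}_{k|k-1}$ is given by

$$
\tilde{x}_{k|k-1} = \phi_{k-1} \tilde{x}_{k-1|k-1} + w_{k-1}
\tag{3.60}
$$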
3.5.1  Scalar  Asymmetrical  Filter 


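The scalar equations for the asymmetrical filter are presented in this section.
These equations have the same form as the vector equations given previously with
the exception that the moment equations are not truncated. When the measurement
noise, process noise, or initial estimation error has an asymmetrical distribution the
filter equation for the scalar model is obtained from (3.14)

$$
\hat{x}_{k|k} = \hat{x}_{k|k-1} + k_k^{(1)} \tilde{z}_k + k_k^{(2)} \alpha_k
\tag{3.61}
$$

where

$$
\alpha_k = \tilde{z}_k^2 - E[\tilde{z}_k^2]
\tag{3.62}
$$

The filter error $\tilde{x}_{k|k} = x_k - \hat{x}_{k|k}$ becomes

$$
\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - k_k^{(1)} \tilde{z}_k - k_k^{(2)} \alpha_k
\tag{3.63}
$$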


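The scalar filter gains (3.20) become

$$
\begin{aligned}
k_k^{(1)} &= \frac{E[\tilde{x}_{k|k-1} \tilde{z}_k]\, E[\alpha_k^2] - E[\tilde{x}_{k|k-1} \alpha_k]\, E[\tilde{z}_k \alpha_k]}{E[\tilde{z}_k^2]\, E[\alpha_k^2] - E[\tilde{z}_k \alpha_k]^2} \\
k_k^{(2)} &= \frac{E[\tilde{x}_{k|k-1} \alpha_k]\, E[\tilde{z}_k^2] - E[\tilde{x}_{k|k-1} \tilde{z}_k]\, E[\tilde{z}_k \alpha_k]}{E[\alpha_k^2]\, E[\tilde{z}_k^2] - E[\tilde{z}_k \alpha_k]^2}
\end{aligned}
\tag{3.64}
$$

where

$$
\begin{aligned}
E[\tilde{x}_{k|k-1} \tilde{z}_k] &= h_k p^{(2)}_{k|k-1} \\
E[\alpha_k \tilde{x}_{k|k-1}] &= h_k^2 p^{(3)}_{k|k-1} \\
E[\alpha_k \tilde{z}_k] &= h_k^3 p^{(3)}_{k|k-1} + r_k^{(3)} \\
E[\tilde{z}_k^2] &= h_k^2 p^{(2)}_{k|k-1} + r_k^{(2)} \\
E[\alpha_k^2] &= -h_k^4 [p^{(2)}_{k|k-1}]^2 + h_k^4 p^{(4)}_{k|k-1} + 4 h_k^2 r_k^{(2)} p^{(2)}_{k|k-1} - [r_k^{(2)}]^2 + r_k^{(4)}
\end{aligned}
\tag{3.65}
$$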


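The scalar versions of the prediction moments obtained from (3.30-3.32) are
given by

$$
p^{(2)}_{k|k-1} = \phi_{k-1}^2 p^{(2)}_{k-1|k-1} + q^{(2)}_{k-1}
\tag{3.66}
$$

$$
p^{(3)}_{k|k-1} = \phi_{k-1}^3 p^{(3)}_{k-1|k-1} + q^{(3)}_{k-1}
\tag{3.67}
$$

$$
p^{(4)}_{k|k-1} = \phi_{k-1}^4 p^{(4)}_{k-1|k-1} + 6 \phi_{k-1}^2 q^{(2)}_{k-1} p^{(2)}_{k-1|k-1} + q^{(4)}_{k-1}
\tag{3.68}
$$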

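The 2nd moment $p^{(2)}_{k|k} = E[\tilde{x}^2_{k|k}]$ can be partitioned into 2nd, 3rd, and 4th
order components of the measurement and prediction moments. Let

$$
p^{(2)}_{k|k} = p^{(2)}_{k|k,2} + p^{(2)}_{k|k,3} + p^{(2)}_{k|k,4}
\tag{3.69}
$$

where $p^{(2)}_{k|k,i}$ consists of ith order moments of the measurement and prediction error.
The $p^{(2)}_{k|k,i}$ are given by

$$
p^{(2)}_{k|k,2} = a_k^2 p^{(2)}_{k|k-1} + [k_k^{(1)}]^2 r_k^{(2)}
\tag{3.70}
$$

$$
p^{(2)}_{k|k,3} = 2 k_k^{(1)} k_k^{(2)} r_k^{(3)} - 2 a_k k_k^{(2)} h_k^2 p^{(3)}_{k|k-1}
\tag{3.71}
$$

$$
p^{(2)}_{k|k,4} = [k_k^{(2)}]^2 \left( h_k^4 \left( p^{(4)}_{k|k-1} - [p^{(2)}_{k|k-1}]^2 \right) + 4 h_k^2 p^{(2)}_{k|k-1} r_k^{(2)} - [r_k^{(2)}]^2 + r_k^{(4)} \right)
\tag{3.72}
$$

where $a_k = 1 - h_k k_k^{(1)}$.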
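The partition of the 2nd filter moment can be checked numerically with a short sketch. It assumes the scalar filter error has the form a x - k1 v - k2 (z^2 - E[z^2]) with a = 1 - h k1, where x is the prediction error, v the measurement noise, and z = h x + v the innovation. The function name `asym_variance_parts` and the sample values are illustrative assumptions.

```python
def asym_variance_parts(h, k1, k2, p2, p3, p4, r2, r3, r4):
    """Return the 2nd, 3rd and 4th order components of the scalar filter
    variance for the asymmetrical filter."""
    a = 1.0 - h * k1
    part2 = a**2 * p2 + k1**2 * r2
    part3 = 2.0 * k1 * k2 * r3 - 2.0 * a * k2 * h**2 * p3
    part4 = k2**2 * (h**4 * (p4 - p2**2) + 4.0 * h**2 * p2 * r2 - r2**2 + r4)
    return part2, part3, part4

# With k2 = 0 and the Kalman gain k1 = h p2 / (h^2 p2 + r2), only the
# 2nd order component survives and equals the Kalman posterior variance.
h, p2, r2 = 1.0, 2.0, 1.0
k1 = h * p2 / (h**2 * p2 + r2)
parts = asym_variance_parts(h, k1, 0.0, p2, 0.0, 12.0, r2, 0.0, 3.0)
kalman_variance = (1.0 - k1 * h) * p2
print(parts, kalman_variance)
```

When the quadratic gain is nonzero, the 3rd and 4th order components modify the variance through the skewness and kurtosis of the errors, which is where the asymmetrical filter departs from the Kalman filter.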

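The 3rd moment $p^{(3)}_{k|k} = E[\tilde{x}^3_{k|k}]$ can be partitioned into 3rd, 4th, 5th, and 6th
order components of the measurement and prediction moments. Let

$$
p^{(3)}_{k|k} = p^{(3)}_{k|k,3} + p^{(3)}_{k|k,4} + p^{(3)}_{k|k,5} + p^{(3)}_{k|k,6}
\tag{3.73}
$$

where $p^{(3)}_{k|k,i}$ consists of ith order moments of the measurement and prediction error.
The $p^{(3)}_{k|k,i}$ are given by

$$
p^{(3)}_{k|k,3} = a_k^3 p^{(3)}_{k|k-1} - [k_k^{(1)}]^3 r_k^{(3)}
\tag{3.74}
$$

$$
\begin{aligned}
p^{(3)}_{k|k,4} &= a_k^2 k_k^{(2)} h_k^2 \left( 3 [p^{(2)}_{k|k-1}]^2 - 3 p^{(4)}_{k|k-1} \right) + [k_k^{(1)}]^2 k_k^{(2)} \left( 3 [r_k^{(2)}]^2 - 3 r_k^{(4)} \right) \\
&\quad + 12 a_k k_k^{(1)} k_k^{(2)} h_k p^{(2)}_{k|k-1} r_k^{(2)}
\end{aligned}
\tag{3.75}
$$

$$
\begin{aligned}
p^{(3)}_{k|k,5} &= a_k [k_k^{(2)}]^2 \left( h_k^4 \left( 3 p^{(5)}_{k|k-1} - 6 p^{(2)}_{k|k-1} p^{(3)}_{k|k-1} \right) + 12 h_k^2 p^{(3)}_{k|k-1} r_k^{(2)} + 12 h_k p^{(2)}_{k|k-1} r_k^{(3)} \right) \\
&\quad + k_k^{(1)} [k_k^{(2)}]^2 \left( -12 h_k^3 p^{(3)}_{k|k-1} r_k^{(2)} - 12 h_k^2 p^{(2)}_{k|k-1} r_k^{(3)} + 6 r_k^{(2)} r_k^{(3)} - 3 r_k^{(5)} \right)
\end{aligned}
\tag{3.76}
$$

$$
\begin{aligned}
p^{(3)}_{k|k,6} &= [k_k^{(2)}]^3 \Big( h_k^6 \left( -2 [p^{(2)}_{k|k-1}]^3 + 3 p^{(2)}_{k|k-1} p^{(4)}_{k|k-1} - p^{(6)}_{k|k-1} \right) \\
&\qquad + h_k^4 \left( 12 [p^{(2)}_{k|k-1}]^2 r_k^{(2)} - 12 p^{(4)}_{k|k-1} r_k^{(2)} \right) - 20 h_k^3 p^{(3)}_{k|k-1} r_k^{(3)} \\
&\qquad + h_k^2 p^{(2)}_{k|k-1} \left( 12 [r_k^{(2)}]^2 - 12 r_k^{(4)} \right) - 2 [r_k^{(2)}]^3 + 3 r_k^{(2)} r_k^{(4)} - r_k^{(6)} \Big)
\end{aligned}
\tag{3.77}
$$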


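The 4th moment $p^{(4)}_{k|k} = E[\tilde{x}^4_{k|k}]$ can be partitioned into 4th, 5th, ..., 8th order
components of the measurement and prediction moments. Let

$$
p^{(4)}_{k|k} = p^{(4)}_{k|k,4} + p^{(4)}_{k|k,5} + p^{(4)}_{k|k,6} + p^{(4)}_{k|k,7} + p^{(4)}_{k|k,8}
\tag{3.78}
$$

where $p^{(4)}_{k|k,i}$ consists of ith order components of the measurement and prediction
moments. The $p^{(4)}_{k|k,i}$ are given by

$$
p^{(4)}_{k|k,4} = a_k^4 p^{(4)}_{k|k-1} + 6 a_k^2 [k_k^{(1)}]^2 p^{(2)}_{k|k-1} r_k^{(2)} + [k_k^{(1)}]^4 r_k^{(4)}
\tag{3.79}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,5} &= 4 a_k^3 k_k^{(2)} h_k^2 \left( p^{(2)}_{k|k-1} p^{(3)}_{k|k-1} - p^{(5)}_{k|k-1} \right) \\
&\quad + 12 a_k^2 k_k^{(1)} k_k^{(2)} \left( 2 h_k p^{(3)}_{k|k-1} r_k^{(2)} + p^{(2)}_{k|k-1} r_k^{(3)} \right) \\
&\quad + 12 a_k [k_k^{(1)}]^2 k_k^{(2)} \left( -h_k^2 p^{(3)}_{k|k-1} r_k^{(2)} - 2 h_k p^{(2)}_{k|k-1} r_k^{(3)} \right) \\
&\quad + 4 [k_k^{(1)}]^3 k_k^{(2)} \left( r_k^{(5)} - r_k^{(2)} r_k^{(3)} \right)
\end{aligned}
\tag{3.80}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,6} &= a_k^2 [k_k^{(2)}]^2 \Big( h_k^4 \left( 6 [p^{(2)}_{k|k-1}]^3 - 12 p^{(2)}_{k|k-1} p^{(4)}_{k|k-1} + 6 p^{(6)}_{k|k-1} \right) \\
&\qquad + 24 h_k^2 p^{(4)}_{k|k-1} r_k^{(2)} + 24 h_k p^{(3)}_{k|k-1} r_k^{(3)} + 6 p^{(2)}_{k|k-1} \left( r_k^{(4)} - [r_k^{(2)}]^2 \right) \Big) \\
&\quad + a_k k_k^{(1)} [k_k^{(2)}]^2 \Big( 48 h_k^3 r_k^{(2)} \left( [p^{(2)}_{k|k-1}]^2 - p^{(4)}_{k|k-1} \right) - 72 h_k^2 p^{(3)}_{k|k-1} r_k^{(3)} \\
&\qquad + 48 h_k p^{(2)}_{k|k-1} \left( [r_k^{(2)}]^2 - r_k^{(4)} \right) \Big) \\
&\quad + [k_k^{(1)}]^2 [k_k^{(2)}]^2 \Big( 6 h_k^4 \left( p^{(4)}_{k|k-1} r_k^{(2)} - [p^{(2)}_{k|k-1}]^2 r_k^{(2)} \right) + 24 h_k^3 p^{(3)}_{k|k-1} r_k^{(3)} \\
&\qquad + 24 h_k^2 p^{(2)}_{k|k-1} r_k^{(4)} + 6 [r_k^{(2)}]^3 - 12 r_k^{(2)} r_k^{(4)} + 6 r_k^{(6)} \Big)
\end{aligned}
\tag{3.81}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,7} &= a_k [k_k^{(2)}]^3 \Big( h_k^6 \left( -12 [p^{(2)}_{k|k-1}]^2 p^{(3)}_{k|k-1} + 12 p^{(2)}_{k|k-1} p^{(5)}_{k|k-1} - 4 p^{(7)}_{k|k-1} \right) \\
&\qquad + h_k^4 \left( 48 p^{(2)}_{k|k-1} p^{(3)}_{k|k-1} r_k^{(2)} - 48 p^{(5)}_{k|k-1} r_k^{(2)} \right) + h_k^3 \left( 48 [p^{(2)}_{k|k-1}]^2 r_k^{(3)} - 80 p^{(4)}_{k|k-1} r_k^{(3)} \right) \\
&\qquad + h_k^2 p^{(3)}_{k|k-1} \left( 60 [r_k^{(2)}]^2 - 60 r_k^{(4)} \right) + h_k p^{(2)}_{k|k-1} \left( 48 r_k^{(2)} r_k^{(3)} - 24 r_k^{(5)} \right) \Big) \\
&\quad + k_k^{(1)} [k_k^{(2)}]^3 \Big( h_k^5 \left( 24 p^{(5)}_{k|k-1} r_k^{(2)} - 48 p^{(2)}_{k|k-1} p^{(3)}_{k|k-1} r_k^{(2)} \right) \\
&\qquad + h_k^4 \left( 60 p^{(4)}_{k|k-1} r_k^{(3)} - 60 [p^{(2)}_{k|k-1}]^2 r_k^{(3)} \right) + h_k^3 p^{(3)}_{k|k-1} \left( 80 r_k^{(4)} - 48 [r_k^{(2)}]^2 \right) \\
&\qquad + h_k^2 p^{(2)}_{k|k-1} \left( 48 r_k^{(5)} - 48 r_k^{(2)} r_k^{(3)} \right) + 12 [r_k^{(2)}]^2 r_k^{(3)} - 12 r_k^{(2)} r_k^{(5)} + 4 r_k^{(7)} \Big)
\end{aligned}
\tag{3.82}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,8} &= [k_k^{(2)}]^4 \Big( h_k^8 \left( -3 [p^{(2)}_{k|k-1}]^4 + 6 [p^{(2)}_{k|k-1}]^2 p^{(4)}_{k|k-1} - 4 p^{(2)}_{k|k-1} p^{(6)}_{k|k-1} + p^{(8)}_{k|k-1} \right) \\
&\qquad + h_k^6 \left( 24 [p^{(2)}_{k|k-1}]^3 r_k^{(2)} - 48 p^{(2)}_{k|k-1} p^{(4)}_{k|k-1} r_k^{(2)} + 24 p^{(6)}_{k|k-1} r_k^{(2)} \right) \\
&\qquad + h_k^5 \left( 56 p^{(5)}_{k|k-1} r_k^{(3)} - 80 p^{(2)}_{k|k-1} p^{(3)}_{k|k-1} r_k^{(3)} \right) \\
&\qquad + h_k^4 \left( [p^{(2)}_{k|k-1}]^2 \left( 54 [r_k^{(2)}]^2 - 54 r_k^{(4)} \right) + p^{(4)}_{k|k-1} \left( 70 r_k^{(4)} - 54 [r_k^{(2)}]^2 \right) \right) \\
&\qquad + h_k^3 p^{(3)}_{k|k-1} \left( 56 r_k^{(5)} - 80 r_k^{(2)} r_k^{(3)} \right) + h_k^2 p^{(2)}_{k|k-1} \left( 24 [r_k^{(2)}]^3 - 48 r_k^{(2)} r_k^{(4)} + 24 r_k^{(6)} \right) \\
&\qquad - 3 [r_k^{(2)}]^4 + 6 [r_k^{(2)}]^2 r_k^{(4)} - 4 r_k^{(2)} r_k^{(6)} + r_k^{(8)} \Big)
\end{aligned}
\tag{3.83}
$$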


3.5.1.1 Special Case - Symmetrical Distributions

In the case where the initial estimation error, the measurement noise, and the
process noise all have symmetrical distributions the expectations in (3.65) reduce to

$$
\begin{aligned}
E[\tilde{x}_{k|k-1} \tilde{z}_k] &= h_k p^{(2)}_{k|k-1} \\
E[\alpha_k \tilde{x}_{k|k-1}] &= 0 \\
E[\alpha_k \tilde{z}_k] &= 0 \\
E[\tilde{z}_k^2] &= h_k^2 p^{(2)}_{k|k-1} + r_k^{(2)} \\
E[\alpha_k^2] &= -h_k^4 [p^{(2)}_{k|k-1}]^2 + h_k^4 p^{(4)}_{k|k-1} + 4 h_k^2 r_k^{(2)} p^{(2)}_{k|k-1} - [r_k^{(2)}]^2 + r_k^{(4)}
\end{aligned}
\tag{3.84}
$$

Substituting into (3.64) the filter gains become

$$
k_k^{(1)} = E[\tilde{x}_{k|k-1} \tilde{z}_k]\, E[\tilde{z}_k^2]^{-1}, \qquad k_k^{(2)} = 0
\tag{3.85}
$$

It is observed that $k_k^{(1)}$ is now the standard Kalman filter gain. With $k_k^{(2)} = 0$ the
3rd and 4th order components of the 2nd filter moment (3.69) are zero. Thus there
is no need to compute 3rd and higher order filter moments. The resulting equations
are the standard Kalman filter equations for linear systems.
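The reduction to the Kalman gain can be verified numerically. The sketch below evaluates the scalar asymmetrical gains from the 2nd through 4th moments of the prediction error (p2, p3, p4) and measurement noise (r2, r3, r4), taking the quadratic innovation term as alpha = z^2 - E[z^2]. The function name `asymmetric_gains` and the sample values are assumptions chosen for illustration.

```python
def asymmetric_gains(h, p2, p3, p4, r2, r3, r4):
    """Scalar gains (k1, k2) of the asymmetrical filter, solved from the
    2 x 2 minimum variance conditions."""
    e_xz = h * p2                                   # E[x z]
    e_xa = h**2 * p3                                # E[x alpha]
    e_za = h**3 * p3 + r3                           # E[z alpha]
    e_zz = h**2 * p2 + r2                           # E[z^2]
    e_aa = h**4 * (p4 - p2**2) + 4.0 * h**2 * r2 * p2 - r2**2 + r4
    det = e_zz * e_aa - e_za**2
    k1 = (e_xz * e_aa - e_xa * e_za) / det
    k2 = (e_xa * e_zz - e_xz * e_za) / det
    return k1, k2

# Symmetric case: all 3rd moments are zero, so k2 vanishes and k1 is the
# Kalman gain h p2 / (h^2 p2 + r2).
k1_sym, k2_sym = asymmetric_gains(1.0, 2.0, 0.0, 12.0, 1.0, 0.0, 3.0)

# Skewed measurement noise (r3 = 1.5): the quadratic gain becomes active.
k1_skew, k2_skew = asymmetric_gains(1.0, 2.0, 0.0, 12.0, 1.0, 1.5, 3.0)
print(k1_sym, k2_sym, k1_skew, k2_skew)
```

In the symmetric case the quadratic term carries no information about the state, so the filter falls back on the linear correction alone. A nonzero 3rd moment in either error source is what activates the second gain.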


3.5.2  Scalar  Symmetrical  Filter 


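The scalar filter equations for the symmetrical non-Gaussian filter are obtained
in the same manner as the vector equations with the exception that the moment
equations are not truncated. From (3.38) the filter equation becomes

$$
\hat{x}_{k|k} = \hat{x}_{k|k-1} + k_k^{(1)} \tilde{z}_k + k_k^{(3)} \tilde{z}_k^3
\tag{3.86}
$$

The filter error $\tilde{x}_{k|k} = x_k - \hat{x}_{k|k}$ becomes

$$
\tilde{x}_{k|k} = \tilde{x}_{k|k-1} - k_k^{(1)} \tilde{z}_k - k_k^{(3)} \tilde{z}_k^3
\tag{3.87}
$$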


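The scalar filter gains are obtained from (3.44). They are given by

$$
\begin{aligned}
k_k^{(1)} &= \frac{E[\tilde{x}_{k|k-1} \tilde{z}_k]\, E[\tilde{z}_k^6] - E[\tilde{x}_{k|k-1} \tilde{z}_k^3]\, E[\tilde{z}_k^4]}{E[\tilde{z}_k^2]\, E[\tilde{z}_k^6] - E[\tilde{z}_k^4]^2} \\
k_k^{(3)} &= \frac{E[\tilde{x}_{k|k-1} \tilde{z}_k^3]\, E[\tilde{z}_k^2] - E[\tilde{x}_{k|k-1} \tilde{z}_k]\, E[\tilde{z}_k^4]}{E[\tilde{z}_k^2]\, E[\tilde{z}_k^6] - E[\tilde{z}_k^4]^2}
\end{aligned}
\tag{3.88}
$$

where

$$
\begin{aligned}
E[\tilde{x}_{k|k-1} \tilde{z}_k] &= h_k p^{(2)}_{k|k-1} \\
E[\tilde{z}_k^2] &= h_k^2 p^{(2)}_{k|k-1} + r_k^{(2)} \\
E[\tilde{x}_{k|k-1} \tilde{z}_k^3] &= h_k^3 p^{(4)}_{k|k-1} + 3 h_k r_k^{(2)} p^{(2)}_{k|k-1} \\
E[\tilde{z}_k^4] &= h_k^4 p^{(4)}_{k|k-1} + 6 h_k^2 r_k^{(2)} p^{(2)}_{k|k-1} + r_k^{(4)} \\
E[\tilde{z}_k^6] &= h_k^6 p^{(6)}_{k|k-1} + 15 h_k^4 r_k^{(2)} p^{(4)}_{k|k-1} + 15 h_k^2 r_k^{(4)} p^{(2)}_{k|k-1} + r_k^{(6)}
\end{aligned}
\tag{3.89}
$$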


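The prediction moments are scalar versions of (3.54) and (3.55)

$$
p^{(2)}_{k|k-1} = \phi_{k-1}^2 p^{(2)}_{k-1|k-1} + q^{(2)}_{k-1}
\tag{3.90}
$$

$$
p^{(4)}_{k|k-1} = \phi_{k-1}^4 p^{(4)}_{k-1|k-1} + 6 \phi_{k-1}^2 q^{(2)}_{k-1} p^{(2)}_{k-1|k-1} + q^{(4)}_{k-1}
\tag{3.91}
$$

The filter variance $p^{(2)}_{k|k} = E[\tilde{x}^2_{k|k}]$ can be partitioned into 2nd, 4th, and 6th
order components of the measurement and prediction moments. Let

$$
p^{(2)}_{k|k} = p^{(2)}_{k|k,2} + p^{(2)}_{k|k,4} + p^{(2)}_{k|k,6}
\tag{3.92}
$$

where $p^{(2)}_{k|k,i}$ consists of ith order components of the measurement and prediction
moments. The $p^{(2)}_{k|k,i}$ are given by

$$
p^{(2)}_{k|k,2} = a_k^2 p^{(2)}_{k|k-1} + [k_k^{(1)}]^2 r_k^{(2)}
\tag{3.93}
$$

$$
p^{(2)}_{k|k,4} = k_k^{(3)} \left( -2 a_k h_k^3 p^{(4)}_{k|k-1} - 6 a_k h_k p^{(2)}_{k|k-1} r_k^{(2)} + 6 k_k^{(1)} h_k^2 p^{(2)}_{k|k-1} r_k^{(2)} + 2 k_k^{(1)} r_k^{(4)} \right)
\tag{3.94}
$$

$$
p^{(2)}_{k|k,6} = [k_k^{(3)}]^2 \left( h_k^6 p^{(6)}_{k|k-1} + 15 h_k^4 p^{(4)}_{k|k-1} r_k^{(2)} + 15 h_k^2 p^{(2)}_{k|k-1} r_k^{(4)} + r_k^{(6)} \right)
\tag{3.95}
$$

where $a_k = 1 - h_k k_k^{(1)}$.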

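The 4th moment $p^{(4)}_{k|k} = E[\tilde{x}^4_{k|k}]$ can be partitioned into 4th, 6th, ..., 12th order
components of the measurement and prediction moments. Let

$$
p^{(4)}_{k|k} = p^{(4)}_{k|k,4} + p^{(4)}_{k|k,6} + p^{(4)}_{k|k,8} + p^{(4)}_{k|k,10} + p^{(4)}_{k|k,12}
\tag{3.96}
$$

where $p^{(4)}_{k|k,i}$ consists of ith order components of the measurement and prediction
moments. The $p^{(4)}_{k|k,i}$ are given by

$$
p^{(4)}_{k|k,4} = a_k^4 p^{(4)}_{k|k-1} + 6 a_k^2 [k_k^{(1)}]^2 p^{(2)}_{k|k-1} r_k^{(2)} + [k_k^{(1)}]^4 r_k^{(4)}
\tag{3.97}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,6} &= a_k^3 k_k^{(3)} \left( -4 h_k^3 p^{(6)}_{k|k-1} - 12 h_k p^{(4)}_{k|k-1} r_k^{(2)} \right) \\
&\quad + a_k^2 k_k^{(1)} k_k^{(3)} \left( 36 h_k^2 p^{(4)}_{k|k-1} r_k^{(2)} + 12 p^{(2)}_{k|k-1} r_k^{(4)} \right) \\
&\quad + a_k [k_k^{(1)}]^2 k_k^{(3)} \left( -12 h_k^3 p^{(4)}_{k|k-1} r_k^{(2)} - 36 h_k p^{(2)}_{k|k-1} r_k^{(4)} \right) \\
&\quad + [k_k^{(1)}]^3 k_k^{(3)} \left( 12 h_k^2 p^{(2)}_{k|k-1} r_k^{(4)} + 4 r_k^{(6)} \right)
\end{aligned}
\tag{3.98}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,8} &= a_k^2 [k_k^{(3)}]^2 \left( 6 h_k^6 p^{(8)}_{k|k-1} + 90 h_k^4 p^{(6)}_{k|k-1} r_k^{(2)} + 90 h_k^2 p^{(4)}_{k|k-1} r_k^{(4)} + 6 p^{(2)}_{k|k-1} r_k^{(6)} \right) \\
&\quad + a_k k_k^{(1)} [k_k^{(3)}]^2 \left( -72 h_k^5 p^{(6)}_{k|k-1} r_k^{(2)} - 240 h_k^3 p^{(4)}_{k|k-1} r_k^{(4)} - 72 h_k p^{(2)}_{k|k-1} r_k^{(6)} \right) \\
&\quad + [k_k^{(1)}]^2 [k_k^{(3)}]^2 \left( 6 h_k^6 p^{(6)}_{k|k-1} r_k^{(2)} + 90 h_k^4 p^{(4)}_{k|k-1} r_k^{(4)} + 90 h_k^2 p^{(2)}_{k|k-1} r_k^{(6)} + 6 r_k^{(8)} \right)
\end{aligned}
\tag{3.99}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,10} &= a_k [k_k^{(3)}]^3 \Big( -4 h_k^9 p^{(10)}_{k|k-1} - 144 h_k^7 p^{(8)}_{k|k-1} r_k^{(2)} \\
&\qquad - 504 h_k^5 p^{(6)}_{k|k-1} r_k^{(4)} - 336 h_k^3 p^{(4)}_{k|k-1} r_k^{(6)} - 36 h_k p^{(2)}_{k|k-1} r_k^{(8)} \Big) \\
&\quad + k_k^{(1)} [k_k^{(3)}]^3 \Big( 36 h_k^8 p^{(8)}_{k|k-1} r_k^{(2)} + 336 h_k^6 p^{(6)}_{k|k-1} r_k^{(4)} + 504 h_k^4 p^{(4)}_{k|k-1} r_k^{(6)} \\
&\qquad + 144 h_k^2 p^{(2)}_{k|k-1} r_k^{(8)} + 4 r_k^{(10)} \Big)
\end{aligned}
\tag{3.100}
$$

$$
\begin{aligned}
p^{(4)}_{k|k,12} &= [k_k^{(3)}]^4 \Big( h_k^{12} p^{(12)}_{k|k-1} + 66 h_k^{10} p^{(10)}_{k|k-1} r_k^{(2)} + 495 h_k^8 p^{(8)}_{k|k-1} r_k^{(4)} \\
&\qquad + 924 h_k^6 p^{(6)}_{k|k-1} r_k^{(6)} + 495 h_k^4 p^{(4)}_{k|k-1} r_k^{(8)} + 66 h_k^2 p^{(2)}_{k|k-1} r_k^{(10)} + r_k^{(12)} \Big)
\end{aligned}
\tag{3.101}
$$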

3.5.2.1 Special Case - Gaussian Noise

In the case where the process noise, the measurement noise, and the initial
conditions have a Gaussian distribution, the symmetrical filter equations reduce to
the standard Kalman filter equations.

If the prediction error $\tilde{x}_{k|k-1}$ is Gaussian, the central moments of the error
can be expressed in terms of the variance as

$$
E[\tilde{x}^n_{k|k-1}] =
\begin{cases}
(1 \times 3 \times \cdots \times (n-1))\, [p^{(2)}_{k|k-1}]^{n/2} & n \text{ even} \\
0 & n \text{ odd}
\end{cases}
\tag{3.102}
$$

The expectations in (3.89) now become

$$
\begin{aligned}
E[\tilde{x}_{k|k-1} \tilde{z}_k] &= h_k p^{(2)}_{k|k-1} \\
E[\tilde{z}_k^2] &= h_k^2 p^{(2)}_{k|k-1} + r_k^{(2)} \\
E[\tilde{x}_{k|k-1} \tilde{z}_k^3] &= 3 h_k^3 [p^{(2)}_{k|k-1}]^2 + 3 h_k r_k^{(2)} p^{(2)}_{k|k-1} \\
E[\tilde{z}_k^4] &= 3 h_k^4 [p^{(2)}_{k|k-1}]^2 + 6 h_k^2 r_k^{(2)} p^{(2)}_{k|k-1} + 3 [r_k^{(2)}]^2 \\
&= 3 E[\tilde{z}_k^2]^2 \\
E[\tilde{z}_k^6] &= 15 h_k^6 [p^{(2)}_{k|k-1}]^3 + 45 h_k^4 r_k^{(2)} [p^{(2)}_{k|k-1}]^2 + 45 h_k^2 [r_k^{(2)}]^2 p^{(2)}_{k|k-1} + 15 [r_k^{(2)}]^3 \\
&= 15 E[\tilde{z}_k^2]^3
\end{aligned}
\tag{3.103}
$$

Substituting these expressions into (3.88) the gains become

$$
k_k^{(1)} = E[\tilde{x}_{k|k-1} \tilde{z}_k]\, E[\tilde{z}_k^2]^{-1}, \qquad k_k^{(3)} = 0
\tag{3.104}
$$

$k_k^{(1)}$ is now the gain for the standard Kalman filter. It is observed that 4th and
higher order moments are no longer required for the gain equations. Thus it is not
necessary to propagate 4th order prediction and filter moments. In addition, since
$k_k^{(3)} = 0$, the 4th and higher order moment terms in the 2nd moment equation (3.92)
vanish and this equation reduces to the filter moment equation for the standard
Kalman filter.
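The Gaussian reduction can be confirmed with a short numerical sketch of the scalar symmetrical gains. The 6th order prediction moment p6 is passed in directly here. The function name `symmetric_gains` and the sample values are assumptions for illustration, and the bimodal noise moments in the second call correspond to an equal-weight sum of two unit-variance Gaussians with means at -2 and +2.

```python
def symmetric_gains(h, p2, p4, p6, r2, r4, r6):
    """Scalar gains (k1, k3) of the symmetrical filter from the even
    moments of the prediction error (p) and measurement noise (r)."""
    e_xz = h * p2
    e_z2 = h**2 * p2 + r2
    e_xz3 = h**3 * p4 + 3.0 * h * r2 * p2
    e_z4 = h**4 * p4 + 6.0 * h**2 * r2 * p2 + r4
    e_z6 = h**6 * p6 + 15.0 * h**4 * r2 * p4 + 15.0 * h**2 * r4 * p2 + r6
    det = e_z2 * e_z6 - e_z4**2
    k1 = (e_xz * e_z6 - e_xz3 * e_z4) / det
    k3 = (e_xz3 * e_z2 - e_xz * e_z4) / det
    return k1, k3

# Gaussian case: moments follow 1, 3, 15 times powers of the variance.
p2, r2 = 2.0, 1.0
k1_g, k3_g = symmetric_gains(1.0, p2, 3 * p2**2, 15 * p2**3,
                             r2, 3 * r2**2, 15 * r2**3)

# Symmetric bimodal measurement noise: r2 = 5, r4 = 43, r6 = 499.
k1_b, k3_b = symmetric_gains(1.0, p2, 3 * p2**2, 15 * p2**3, 5.0, 43.0, 499.0)
print(k1_g, k3_g, k1_b, k3_b)
```

For Gaussian moments the cubic gain vanishes and the linear gain equals h p2 / (h^2 p2 + r2). With the bimodal noise the cubic gain is nonzero, so the filter applies a nonlinear correction to the innovations.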


3.6  Approximation  of  Prediction  Moments 

Since the filter equation (3.61) contains quadratic terms involving the innovations,
the nth order filter moment is a function of prediction moments up to
order 2n. Similarly, since the filter equation (3.86) contains 3rd order functions of
the innovations, filter moments of order n are functions of prediction moments up
to order 3n. This problem cannot be solved by simply computing the high order
prediction moments needed in the filter moment equations, since these high order
prediction moments require the availability of filter moments of the same order.

One method to deal with this problem is to truncate the filter moment equations,
including only 4th and lower order prediction moments. This was done in the
derivation of the vector-based non-Gaussian filter equations. Another approach is
to approximate the higher order prediction moments. Using this approach 5th and
higher order moments are approximated as functions of the 2nd, 3rd, and 4th order
prediction moments.

Two approximations are considered. The first approximation was stated by
Rao and Yar [27]. For symmetrical distributions they approximate higher than 4th
order moments of a random variable $v_k$ using the relation

$$
E[v_k^n] =
\begin{cases}
(n-1)\, E[v_k^2]\, E[v_k^{n-2}] & n \text{ even} \\
0 & n \text{ odd}
\end{cases}
\tag{3.105}
$$

This formula is exact if $v_k$ has a Gaussian distribution.
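The recursion in (3.105) is simple to apply in code. The sketch below assumes the 2nd and 4th moments are known and fills in even moments of order 6 and above. The function name `approx_even_moments` is a hypothetical helper.

```python
def approx_even_moments(m2, m4, max_order):
    """Return a list m with m[n] approximating E[v^n] for a zero mean
    symmetric variable: m[2] and m[4] are given, odd moments are zero,
    and m[n] = (n - 1) * m[2] * m[n - 2] for even n > 4."""
    m = [0.0] * (max_order + 1)
    m[0] = 1.0
    if max_order >= 2:
        m[2] = m2
    if max_order >= 4:
        m[4] = m4
    for n in range(6, max_order + 1, 2):
        m[n] = (n - 1) * m2 * m[n - 2]
    return m

# Gaussian check with variance 2: exact moments are 12, 120 and 1680
# for orders 4, 6 and 8.
moments = approx_even_moments(2.0, 12.0, 8)
print(moments)
```

For a non-Gaussian variable the true 4th moment enters through `m4`, so the approximation retains the measured kurtosis while extrapolating the higher moments with the Gaussian growth rule.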

Another approximation is described in [74], pp. 246-258. This expansion,
called the Gram-Charlier series, approximates the density of a non-Gaussian distribution
in terms of a Gaussian function, its derivatives, and the moments of the
original density function.

Let x be a random variable with mean $\mu$ and variance $\sigma^2$ and arbitrary density
function w(x). Let y be the normalized random variable with nth moment $\nu_n$ defined
by

$$
y = \frac{x - \mu}{\sigma}
\tag{3.106}
$$

If $\varphi(y)$ represents a Gaussian density function, the Gram-Charlier series approximation
for the density function w(y) is given by

$$
w(y) = \varphi(y) \left( 1 - a_3 H_3(y) + a_4 H_4(y) - a_5 H_5(y) + \cdots + (-1)^n a_n H_n(y) + \cdots \right)
$$

where $H_n(y)$ are the Hermite polynomials expressed as

$$
H_{n+1}(y) = y H_n(y) - n H_{n-1}(y)
\tag{3.107}
$$

with $H_0(y) = 1$, $H_1(y) = y$. The coefficients $a_n$ are computed from the Hermite
polynomials and the density w(y) using

$$
a_n = \frac{(-1)^n}{n!} \int_{-\infty}^{\infty} w(y) H_n(y)\, dy
\tag{3.108}
$$

It is pointed out in [74] that when a limited number of coefficients are available
specific groupings of coefficients are appropriate. In particular, the Edgeworth series
involves the groupings

0, 3

0, 3, 4, 6

The coefficients corresponding to terms 0, 3, 4, and 6 are

$$
\begin{aligned}
a_0 &= 1 \\
a_3 &= -\nu_3 / 3! \\
a_4 &= (\nu_4 - 3) / 4! \\
a_6 &= (\nu_6 - 15 \nu_4 + 30) / 6!
\end{aligned}
$$

The central moments of the random variable x are then evaluated using

$$
E[(x - \mu)^n] = \int_{-\infty}^{\infty} (\sigma y)^n w(y)\, dy
\tag{3.109}
$$
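The listed coefficients can be reproduced from the Hermite recursion. The sketch below assumes the sign convention a_n = ((-1)^n / n!) E[H_n(y)], which is consistent with the coefficient a_3 = -nu_3 / 3! given above. The function names and the sample moments are illustrative assumptions.

```python
from math import factorial

def hermite_coeffs(n_max):
    """Coefficient lists (lowest power first) of H_0 .. H_n_max, built with
    H_{n+1}(y) = y H_n(y) - n H_{n-1}(y), H_0 = 1, H_1 = y."""
    polys = [[1.0], [0.0, 1.0]]
    for n in range(1, n_max):
        nxt = [0.0] + polys[n]                 # y * H_n
        for j, c in enumerate(polys[n - 1]):
            nxt[j] -= n * c                    # - n * H_{n-1}
        polys.append(nxt)
    return polys[:n_max + 1]

def gram_charlier_coeff(n, nu):
    """a_n from normalized moments nu[j] = E[y^j] (nu[0] = 1, nu[1] = 0,
    nu[2] = 1), using the assumed sign convention."""
    h_n = hermite_coeffs(n)[n]
    expectation = sum(c * nu[j] for j, c in enumerate(h_n))
    return (-1) ** n * expectation / factorial(n)

# Hypothetical normalized moments of a skewed, heavy-tailed density.
nu = [1.0, 0.0, 1.0, 0.5, 4.0, 0.0, 20.0]
a3 = gram_charlier_coeff(3, nu)
a4 = gram_charlier_coeff(4, nu)
a6 = gram_charlier_coeff(6, nu)

# For Gaussian moments every coefficient beyond a_0 vanishes.
nu_gauss = [1.0, 0.0, 1.0, 0.0, 3.0, 0.0, 15.0]
a_gauss = [gram_charlier_coeff(n, nu_gauss) for n in (3, 4, 6)]
print(a3, a4, a6, a_gauss)
```

The coefficient a_6 depends on the 6th moment, so it is only available when that moment is supplied or approximated.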


3.7  Experimental  Evaluation  of  the  Non-Gaussian  Filters 


The objective of this section is to determine the performance of the approximate
filters described in this chapter. In order to measure performance it is desirable
to compare these filters to optimal estimators in non-Gaussian noise. Unfortunately,
optimal estimators do not exist for arbitrary probability distributions. However, an
optimal estimator is available for distributions consisting of Gaussian sums. Let the
measurement noise, process noise, and initial estimation error be represented as a
sum of I Gaussian distributions with aggregate density function

$$
p(x) = \sum_{i=1}^{I} c_i N(x - a_i, B_i)
\tag{3.110}
$$

where

$$
\sum_{i=1}^{I} c_i = 1, \qquad c_i > 0 \;\; \forall i.
\tag{3.111}
$$

Given a priori densities of this form, the a posteriori densities of the state given the
data are determined by direct application of Bayes' rule. The resulting estimator is
denoted the Gaussian sum filter. The Gaussian sum filter relations are given in [10].
The operation of the Gaussian sum filter can be very computationally intensive.
For L densities used to describe the initial state, N densities for the measurement
noise, and M densities for the process noise, the state prediction at the
first stage requires the propagation of L * M separate estimates. The filtered estimate
for the first stage requires L * M * N estimates. In general, the number of
separate prediction estimates that must be computed for the kth stage is given by
$L M^k N^{k-1}$, and the number of separate filtered estimates is $L M^k N^k$. The
Gaussian sum representation essentially results in several Kalman filters operating
in parallel. A weighted sum of these filters is used to form the a posteriori density.
The conditional mean is formed as the convex combination of the mean values of
the individual terms, or Kalman filters, of the Gaussian sum. It is important to
note that the weighting function used to form this convex combination is dependent
on the measurement data, causing the conditional mean to be a nonlinear function
of the data. In contrast to the linear Kalman filter, the conditional variance of the
Gaussian sum filter is a function of the measurement data. Thus the conditional
variance is not expected to converge smoothly as it does with the linear Kalman
filter. Additionally, it cannot be computed off line as can be done with the Kalman
filter for linear systems.
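The growth in the number of terms is easy to tabulate. The sketch below evaluates the stage-k counts stated above; the function name `gaussian_sum_terms` and the example sizes are illustrative assumptions.

```python
def gaussian_sum_terms(L, M, N, k):
    """Number of separate prediction and filtered estimates at stage k for
    L initial-state densities, M process noise densities and N measurement
    noise densities, when no terms are discarded."""
    predictions = L * M**k * N**(k - 1)
    filtered = L * M**k * N**k
    return predictions, filtered

# One initial density with two-term process and measurement noise.
for k in (1, 5, 10):
    print(k, gaussian_sum_terms(1, 2, 2, k))
```

Even with two-term noise densities the filtered estimate involves 1024 terms by the fifth stage, which motivates discarding terms with very small weights.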

The  primary  advantage  of  the  Gaussian  sum  approach  is  that  it  forms  an 
explicit  representation  of  the  a  posteriori  density.  This  representation  is  optimal 
if  the  errors  are  truly  made  up  of  a  sum  of  Gaussian  distributions.  The  major 
disadvantage  of  the  Gaussian  sum  filter  is  the  geometric  progression  of  the  number 
of  separate  Kalman  filters  that  are  required  to  implement  the  estimator.  However, 
the  number  of  filters  can  be  limited  by  disregarding  those  terms  in  the  Gaussian 
sum  that  have  very  small  weights. 
Our interest in experimental evaluation of the approximate filters is to determine
what degree of improvement these filters offer over the standard Kalman
filter and to determine how closely these approximate filters match the performance
of the optimal estimator using Gaussian sums. Also of interest is to determine if the
truncated forms of the asymmetrical and symmetrical filter approximations given
in Sections 3.2 and 3.3 give similar performance to the nontruncated expressions of
Sections 3.5.1 and 3.5.2.


A  scalar  model  is  used  to  evaluate  the  performance  of  the  filters.  The  plant 
and  measurement  equations  for  this  model  have  the  form 


Xjfe  =  0.5xjt-i  +  wk-l 


=  +  Vk 


(3.112) 


where $w_k$ and $v_k$ are mutually independent, zero mean, possibly non-Gaussian random
processes, and the initial estimation error for $x_0$ may also be non-Gaussian.
The non-Gaussian distributions are modeled as the sum of two Gaussian distributions
with unit variance. In general the non-Gaussian distribution for a random
variable y is given by

$$
p(y) = \sum_{i=1}^{I} c_i N(y - \mu_i, 1)
\tag{3.113}
$$

where $\sum_{i=1}^{I} c_i = 1$. For the special case of two distributions (I = 2) the parameter
D is defined as the separation between the means of the distributions. In this case
for p(y) to have zero mean $\mu_1 = -c_2 D$ and $\mu_2 = c_1 D$. If $c_1 = c_2$, p(y) is
symmetric. If D = 0, p(y) is Gaussian.
In general the density p(y) has zero mean and the next three moments are

$$
\begin{aligned}
E[y^2] &= \sum_{i=1}^{I} c_i (1 + \mu_i^2) \\
E[y^3] &= \sum_{i=1}^{I} c_i (\mu_i^3 + 3 \mu_i) \\
E[y^4] &= \sum_{i=1}^{I} c_i (3 + 6 \mu_i^2 + \mu_i^4)
\end{aligned}
\tag{3.114}
$$
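A short sketch of the moment formulas for the two-component case may help when setting up the noise models. It assumes unit-variance components with weights c1 and 1 - c1 and means placed so that the mixture has zero mean. The function name `two_component_moments` is a hypothetical helper.

```python
def two_component_moments(c1, D):
    """Mean and 2nd to 4th moments of c1 N(mu1, 1) + c2 N(mu2, 1) with
    c2 = 1 - c1, mu1 = -c2 D and mu2 = c1 D."""
    c2 = 1.0 - c1
    weights = (c1, c2)
    means = (-c2 * D, c1 * D)
    m1 = sum(c * mu for c, mu in zip(weights, means))
    m2 = sum(c * (1.0 + mu**2) for c, mu in zip(weights, means))
    m3 = sum(c * (mu**3 + 3.0 * mu) for c, mu in zip(weights, means))
    m4 = sum(c * (3.0 + 6.0 * mu**2 + mu**4) for c, mu in zip(weights, means))
    return m1, m2, m3, m4

asym = two_component_moments(0.2, 10.0)   # skewed: means at -8 and 2
sym = two_component_moments(0.5, 10.0)    # symmetric: means at -5 and 5
gauss = two_component_moments(0.2, 0.0)   # D = 0 collapses to N(0, 1)
print(asym, sym, gauss)
```

With unequal weights the 3rd moment is nonzero, with equal weights it vanishes, and with D = 0 the standard Gaussian moments 1, 0, and 3 are recovered.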

3.7.1  Asymmetrical  Filter  Results 

An asymmetrical distribution is obtained with $c_1 = 0.2$, $c_2 = 0.8$, and I = 2 in equation
(3.113). The system represented by equations (3.112) is used to evaluate the different
filters for various combinations of non-Gaussian measurement noise $v_k$, process noise
$w_k$, and initial estimation error $x_0$. The three noise models are given in Table 3.1
below.


Table 3.1. Noise Models for Non-Gaussian Filter Evaluation

Model   v_k                             w_k                             x_0
1       \sum_{i=1}^{2} c_i N(\mu_i, 1)  N(0, 1)                         N(x_0, 1)
2       N(0, 1)                         \sum_{i=1}^{2} c_i N(\mu_i, 1)  N(x_0, 1)
3       N(0, 1)                         N(0, 1)                         \sum_{i=1}^{2} c_i N(\mu_i + x_0, 1)


The asymmetrical filter of Section 3.5.1 is evaluated in three different configurations.
In the first configuration, denoted Asym/T1, the asymmetrical filter
equations of Section 3.5.1 are modified so that the 3rd order filter moment contains
only functions of 3rd order prediction moments and 3rd order measurement error,
and the 4th order filter moment contains only functions of 4th order prediction
moments and 4th order measurement error. Thus, the terms $p^{(3)}_{k|k,4}$, $p^{(3)}_{k|k,5}$, and $p^{(3)}_{k|k,6}$ are
set to zero in equation (3.73), and $p^{(4)}_{k|k,5}$, $p^{(4)}_{k|k,6}$, $p^{(4)}_{k|k,7}$, and $p^{(4)}_{k|k,8}$ are set to zero in
equation (3.78). The second asymmetrical filter configuration, denoted Asym/T2,
is similar to Asym/T1 with the exception that the 4th order term $p^{(3)}_{k|k,4}$ is retained
in (3.73). The vector formulation for this truncated filter configuration is developed in
Section 3.2. The last asymmetrical filter configuration, denoted Asym/Edge, uses
the complete filter configuration of Section 3.5.1 with 5th and higher order moments
being approximated by the Edgeworth expansion using coefficients $a_0$ and $a_3$. The
second Edgeworth expansion, which uses the terms $a_0$, $a_3$, $a_4$, and $a_6$, cannot be used
because the $a_6$ coefficient requires the availability of the 6th order moment. These
three asymmetrical filters are compared with the standard Kalman
filter and the Gaussian sum filter.

Figure 3.1 displays the non-Gaussian noise distribution, the state estimation
error $\tilde{x}_{k|k}$, and the filter variance $p^{(2)}_{k|k}$ for a typical simulation of the system in
equation (3.112) for Model 1. The separation between the distributions for the
non-Gaussian  noise  was  D  =  10.  Figure  3.1  illustrates  that  the  asymmetrical  filter 
performance  is  significantly  better  than  the  standard  Kalman  filter,  but  not  as  good 
as  the  Gaussian  sum  filter.  The  three  asymmetrical  filter  configurations  perform 
about  equally.  It  is  observed  that  since  Model  1  is  indeed  a  Gaussian  sum  model, 
the  Gaussian  sum  filter  produces  an  optimal  estimate.  The  primary  purpose  of  this 
test  is  to  compare  the  standard  Kalman  filter  to  the  HOF,  since  both  filters  use 
only  the  error  moments  and  not  the  density  function  to  perform  estimation.  The 
Gaussian  sum  filter  performance  is  included  only  as  a  reference. 


A Monte Carlo analysis was performed to determine the sample variance of
the estimation error. Fifty separate simulation runs of the system were made for
each model. The estimation error was accumulated over the last five samples of each
run, resulting in a total of 250 samples of the estimation error. The sample variances
of the filter error and the prediction error are presented in Table 3.2 for
every estimator for each model.

Figure 3.1. Typical Asymmetrical Filter Results, Model 1, D = 10, Bimodal Measurement Noise
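The pooling procedure (50 runs, last five samples of each run, 250 error samples in total) can be sketched as follows. This is only an illustration for the standard Kalman filter: the scalar system, the values phi = 0.9 and h = 1, the run length of 30 steps, and all function names are assumptions made here and are not taken from equation (3.112).

```python
import random
import statistics


def bimodal_noise(rng, separation):
    """Equal-weight mixture of N(-D/2, 1) and N(+D/2, 1)."""
    center = separation / 2.0 if rng.random() < 0.5 else -separation / 2.0
    return rng.gauss(center, 1.0)


def kalman_run(rng, steps=30, phi=0.9, h=1.0, q=1.0, separation=10.0):
    """One run of a scalar Kalman filter (assumed system); returns filter errors."""
    r = 1.0 + (separation / 2.0) ** 2      # variance of the bimodal noise
    x = rng.gauss(0.0, 1.0)                # true initial state
    x_hat, p = 0.0, 1.0                    # initial estimate and variance
    errors = []
    for _ in range(steps):
        x = phi * x + rng.gauss(0.0, q ** 0.5)      # propagate the true state
        x_pred = phi * x_hat                        # predicted estimate
        p_pred = phi * phi * p + q                  # prediction variance
        z = h * x + bimodal_noise(rng, separation)  # non-Gaussian measurement
        gain = p_pred * h / (h * h * p_pred + r)
        x_hat = x_pred + gain * (z - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        errors.append(x - x_hat)
    return errors


def pooled_sample_variance(runs=50, tail=5, seed=0):
    """Pool the last `tail` errors of each run and return (count, variance)."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(runs):
        pooled.extend(kalman_run(rng)[-tail:])
    return len(pooled), statistics.variance(pooled)


if __name__ == "__main__":
    print(pooled_sample_variance())
```

Only the last few samples of each run are pooled so that the filter has settled before the error statistics are collected.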


Table 3.2. Sample Variances for the Asymmetrical Filter - D = 10

Filter       Model 1                           Model 2                           Model 3
Type         $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$
Kalman       1.25             1.35                0.983            18.4                0.575            1.20
Asym/T1      0.952            1.29                0.828            18.5                0.575            1.20
Asym/T2      0.954            1.29                0.828            18.5                0.575            1.20
Asym/Edge    0.944            1.29                0.828            18.5                0.575            1.20
Gaus Sum     0.575            1.20                0.575            18.5                0.575            1.20

The  Monte  Carlo  results  are  consistent  with  the  observations  made  from  the 
single  run  results  in  that  the  asymmetrical  filter  performed  very  well  in  relation  to 
the  standard  Kalman  filter.  The  fact  that  the  Asym/Edge  model  gives  the  same 
performance  as  any  of  the  truncated  forms  suggests  that  the  truncated  filters  are 
sufficient  to  characterize  the  asymmetrical  filter.  Although  there  may  be  other 
approximations  for  higher  moments  that  give  better  results  than  the  Edgeworth 
series, this suggests that there is no need to go through the lengthy and cumbersome
vector expansion for 5th and higher order components for the 3rd and 4th filter
moments.

A similar study was done on the asymmetrical filters for a distribution separation
of D = 5. The Monte Carlo results for this configuration are given in Table 3.3.
Table 3.3. Sample Variances for the Asymmetrical Filter - D = 5

Filter       Model 1                           Model 2                           Model 3
Type         $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$
Kalman       1.08             1.34                0.910            5.50                0.575            1.20
Asym/T1      0.923            1.32                0.823            5.49                0.575            1.20
Asym/T2      0.923            1.32                0.823            5.49                0.575            1.20
Asym/Edge    0.922            1.32                0.823            5.49                0.575            1.20
Gaus Sum     0.742            1.28                0.752            5.49                0.575            1.20

As expected, the sample variances of the estimation error are closer together
than they were in Table 3.2, where D = 10. As the distribution separation D
approaches 0, all of the results would be the same and all of the estimators would
be optimal.

3.7.2  Symmetrical  Filter  Results 

A symmetrical non-Gaussian distribution is generated with parameter values
$c_1 = 0.5$, $c_2 = 0.5$, and $l = 2$ in equation (3.113). The system represented by
equation (3.112) was evaluated for various combinations of non-Gaussian measurement
noise $v_k$, process noise $w_k$, and initial estimation error $x_0$ as expressed in
Table 3.1.

The symmetrical filter was evaluated in three different configurations. In
the first configuration, denoted Sym/T, the symmetrical filter equations of Section
3.5.2 were modified so that the second order filter moment contains only functions
of 2nd and 4th order prediction moments and measurement moments, and the 4th
order filter moment contains only 4th order functions of the prediction
and measurement errors. Thus, the 6th order terms were set to zero in equation
(3.92), and the 6th, 8th, 10th, and 12th order moment terms were set to zero in
equation (3.96). The
second symmetrical filter configuration, denoted Sym/Rao, used the complete filter
configuration of Section 3.5.2. Moments of order higher than 4th were approximated
using the formula described in equation (3.105), which was obtained from Rao and
Yar [27]. The last symmetrical filter configuration, denoted Sym/Edge, used the
complete filter configuration of Section 3.5.2 with moments of order higher than 4th
approximated by the Edgeworth expansion using coefficients $a_0$ and $a_3$. It is
noted that $E[z_k^6]$ in equation (3.89) requires the availability of the 6th moment of the
prediction error, and the 6th moment of the measurement error. For the truncated
model Sym/T the Rao approximation was used for these moments.

The Monte Carlo results for 250 samples are given in Table 3.4. This table
shows that the symmetrical HOF performs better than the standard Kalman
filter, but not quite as well as the optimal Gaussian sum filter. Among the three
symmetrical filter configurations, the truncated form and the form that uses the
Rao approximation perform the same. However, the Sym/Edge filter performance is
somewhat poorer than that of the other non-Gaussian filters. This discrepancy is probably
due to the fact that the Edgeworth expansion is based on a Gaussian kernel and
this approximation degrades as the separation D increases. A similar study was
performed for symmetrical distributions with D = 5. The Monte Carlo results are
given in Table 3.5.


Figure 3.2. Typical 6th Order Symmetrical Filter Results, Model 1, D = 10, Bimodal Measurement Noise: (b) Estimation Error, (c) Error Variance, each plotted against Sample (k)
Table 3.4. Sample Variances for the Symmetrical Filter - D = 10

Filter       Model 1                           Model 2                           Model 3
Type         $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$
Kalman       1.33             1.35                0.947            25.7                0.575            1.20
Sym/T        1.22             1.32                0.908            25.7                0.575            1.20
Sym/Rao      1.22             1.32                0.908            25.7                0.575            1.20
Sym/Edge     1.29             1.34                0.933            25.7                0.575            1.20
Gaus Sum     0.575            1.20                0.575            25.6                0.575            1.20

Table 3.5. Sample Variances for the Symmetrical Filter - D = 5

Filter       Model 1                           Model 2                           Model 3
Type         $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$   $p^{(2)}_{k|k}$  $p^{(2)}_{k|k-1}$
Kalman       1.20             1.35                0.876            7.20                0.575            1.20
Sym/T        1.10             1.32                0.838            7.21                0.575            1.20
Sym/Rao      1.10             1.32                0.838            7.21                0.575            1.20
Sym/Edge     1.15             1.33                0.856            7.20                0.575            1.20
Gaus Sum     0.914            1.26                0.782            7.21                0.575            1.20

As  with  the  asymmetrical  filter,  as  the  separation  D  becomes  smaller,  the 
sample  variances  of  the  estimation  error  are  clustered  closer  together  for  all  models 
since  as  D  approaches  0  the  mixture  distribution  becomes  “more”  Gaussian.  For 
this  case  the  sample  variance  for  the  Sym/Edge  model  is  now  about  halfway  between 
the  sample  variance  for  the  standard  Kalman  filter  and  the  Sym/T  model,  whereas 
the  sample  variance  for  Sym/Edge  for  D  =  10  was  closer  to  the  Kalman  sample 
variance.  This  is  expected  since  the  non-Gaussian  distribution  is  “more”  Gaussian 
as D becomes smaller and thus the Edgeworth approximation is more valid.


Better results may be obtained for the symmetrical filter by propagating
6th order filter and prediction moments. The 6th order prediction moment can be
obtained directly from the prediction error equation $\tilde{x}_{k|k-1} = \phi_{k-1}\tilde{x}_{k-1|k-1} + w_{k-1}$.

The  6th  moment  is  given  by 
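As a sketch of the form such an expansion takes, assume that the filter error and the process noise are independent with symmetric densities, so that all odd moments vanish. The symbols $q^{(n)}_{k-1} = E[w_{k-1}^n]$ for the process noise moments are notation assumed here for illustration. Raising the prediction error equation to the sixth power and taking expectations leaves only the even binomial terms:

```latex
\begin{align*}
p^{(6)}_{k|k-1}
  &= E\!\left[\left(\phi_{k-1}\tilde{x}_{k-1|k-1} + w_{k-1}\right)^{6}\right] \\
  &= \phi_{k-1}^{6}\, p^{(6)}_{k-1|k-1}
   + 15\,\phi_{k-1}^{4}\, p^{(4)}_{k-1|k-1}\, q^{(2)}_{k-1}
   + 15\,\phi_{k-1}^{2}\, p^{(2)}_{k-1|k-1}\, q^{(4)}_{k-1}
   + q^{(6)}_{k-1}
\end{align*}
```

The coefficients 15 are the binomial coefficients $\binom{6}{2} = \binom{6}{4}$.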


If the 3rd power of the innovations in equation (3.87) is disregarded, the 6th order
filter moment can be expressed in terms of only 6th order functions of the prediction
and measurement errors. This truncation leads to

\[ p^{(6)}_{k|k} = a_k^6\, p^{(6)}_{k|k-1} + 15\, a_k^4 \big(k_k^{(1)}\big)^2 p^{(4)}_{k|k-1}\, r_k^{(2)} + 15\, a_k^2 \big(k_k^{(1)}\big)^4 p^{(2)}_{k|k-1}\, r_k^{(4)} + \big(k_k^{(1)}\big)^6 r_k^{(6)} \qquad (3.116) \]

where $a_k = 1 - k_k^{(1)} h_k$.


In addition, the 6th order terms are retained in (3.92), and the 6th order moment
term is retained in (3.96).


With this configuration the 6th order symmetrical filters were tested for D =
5, 10, and 15 for bimodal measurement and process noise distributions. An example
simulation run for bimodal measurement noise (Model 1) is shown in Figure 3.2.
Table 3.6 compares the results of the linear Kalman filter, the 4th and 6th order
symmetrical filters, and the Gaussian sum filter for Model 1. Included in this table
are the values of the bimodal noise variance $E[v_k^2]$. Define a coefficient of excess $\Gamma_x$
for a random variable x as

\[ \Gamma_x = \frac{3\,\big(E[x^2]\big)^2}{E[x^4]} \qquad (3.117) \]

$\Gamma_x$ represents the deviation of the actual fourth moment from the Gaussian fourth
moment; for a Gaussian density $\Gamma_x = 1$. This value is also included in Table 3.6.
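The coefficient of excess for a unit-variance Gaussian sum follows directly from the second and fourth moment expressions of (3.114). The sketch below is illustrative only; the function name and the explicit lists of weights and means are assumptions introduced here.

```python
def coefficient_of_excess(weights, means):
    """3 E[x^2]^2 / E[x^4] for p(x) = sum_i c_i N(mu_i, 1) with zero mean."""
    m2 = sum(c * (1.0 + mu**2) for c, mu in zip(weights, means))
    m4 = sum(c * (3.0 + 6.0 * mu**2 + mu**4) for c, mu in zip(weights, means))
    return 3.0 * m2**2 / m4


if __name__ == "__main__":
    # Bimodal, equal weights, means at -D/2 and +D/2.
    for d in (5.0, 10.0):
        print(d, coefficient_of_excess([0.5, 0.5], [-d / 2.0, d / 2.0]))
    # Heavy-tailed: weights 0.2, 0.6, 0.2 with means -D/2, 0, +D/2, D = 10.
    print(coefficient_of_excess([0.2, 0.6, 0.2], [-5.0, 0.0, 5.0]))
    # A single Gaussian gives exactly 1.
    print(coefficient_of_excess([1.0], [0.0]))
```

For the bimodal case this gives approximately 1.98 at D = 5 and 2.61 at D = 10, in agreement with the tabulated values, and approximately 1.16 for the heavy-tailed case at D = 10.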
Table 3.6. Comparison of Sample Filter Variances for 4th and 6th Order Filters
Bimodal Measurement Noise Distribution

                                  Filter   Sample Filter Variances
D     $E[v_k^2]$  $\Gamma_{v_k}$  Order    Kalman   Sym/T    Sym/Rao   Sym/Edge   Gaus Sum
5     7.25        1.98            4th      1.20     1.10     1.10      1.15       0.914
                                  6th               1.01     1.01      1.01
10    26          2.61            4th      1.33     1.22     1.22      1.29       0.575
                                  6th               0.773    0.824     0.897
15    101         2.89            4th      1.34     1.28     1.28      1.32       0.575
                                  6th               0.673    0.773     0.880

Table 3.6 shows that as the separation D increases the 6th order filters perform
significantly better than the 4th order filters. The truncated filter, Sym/T,
gives the best overall performance. In fact, for D = 15 the sample variance of
Sym/T (0.673) is very close to that of the Gaussian sum filter (0.575). It is observed
that as D increases from 5 to 15, $\Gamma_{v_k}$ increases from 1.98 to 2.89. Thus,
as the bimodal distribution becomes "more" non-Gaussian, the 6th order filter performs
much better relative to the linear Kalman filter. It is important to note that
the high order filters may become unstable under certain conditions. This is evident
in Figure 3.2(c), in which the Sym/Edge configuration behaves erratically. However,
unlike the Gaussian sum filter, the error variance for the non-Gaussian filters
can be evaluated off-line without any measurement data if the system is linear, so
stability can be determined before the filter is implemented on actual data.

Table  3.7  gives  the  results  for  Model  2,  bimodal  process  noise.  The  results 
are  consistent  with  those  in  Table  3.6. 
Table 3.7. Comparison of Sample Filter Variances for 4th and 6th Order Filters
Bimodal Process Noise Distribution

                                  Filter   Sample Filter Variances
D     $E[w_k^2]$  $\Gamma_{w_k}$  Order    Kalman   Sym/T    Sym/Rao   Sym/Edge   Gaus Sum
5     7.25        1.98            4th      0.876    0.837    0.837     0.856      0.782
                                  6th               0.789    0.791     0.794
10    26          2.61            4th      0.947    0.908    0.908     0.933      0.575
                                  6th               0.684    0.718     0.798
15    101         2.89            4th      0.964    0.939    0.939     0.956      0.575
                                  6th               0.643    0.739     0.790

A commonly encountered non-Gaussian distribution is the so-called heavy-tailed
Gaussian distribution. This distribution is composed of a large central lobe
and two smaller lobes separated by an equal distance on each side of the main lobe.
To generate this distribution $l = 3$, $c_1 = c_3 = 0.2$, $c_2 = 0.6$, $\mu_1 = -D/2$, $\mu_2 = 0$,
and $\mu_3 = D/2$ were used in equation (3.113). Figure 3.3 displays the non-Gaussian
noise distribution, the estimation error $\tilde{x}_{k|k}$, and the filter variance $p^{(2)}_{k|k}$ for a typical
simulation of the system in equation (3.112) for Model 1. The separation between
the distributions for the non-Gaussian noise was D = 10. Tables 3.8 and 3.9 compare
the performance of the non-Gaussian filters to the Kalman and Gaussian sum filters
for Models 1 and 2 respectively.

Tables 3.8 and 3.9 show that the Gaussian sum filters do not offer any significant
improvement over the Kalman filter for the heavy-tailed distributions used
for these examples. For the heavy-tailed distributions $\Gamma_{v_k}$ and $\Gamma_{w_k}$ are approximately
equal to 1. This again demonstrates that the relative performance of the
Table 3.8. Comparison of Sample Filter Variances for 4th and 6th Order Filters
Heavy-Tailed Measurement Noise Distribution


Sample  Filter  Variances 


Kalman  Sym/T  Sym/Rao  Sym/Edge  Gaus  Sum 


.967 


D     $E[v_k^2]$  $\Gamma_{v_k}$
5     3.5         1.10
10    11          1.16
15    23.5        1.18

0.964  0.966 


0.967 


1.20  1.20 


1.21 


1.27  1.27 


1.28 


0.967 


1.20 


1.21 


1.27 


1.28 


Table 3.9. Comparison of Sample Filter Variances for 4th and 6th Order Filters
Heavy-Tailed  Process  Noise  Distribution 
non-Gaussian filters can be assessed by comparing the fourth moment to that of a
Gaussian  distribution.  If  the  coefficient  of  excess  is  close  to  1,  then  the  standard 
Kalman  filter  gives  similar  performance  to  the  symmetrical  high  order  filter. 

The last non-Gaussian distribution considered is the uniform distribution.
Figure 3.4 displays the noise distribution, the estimation error $\tilde{x}_{k|k}$, and the filter
variance $p^{(2)}_{k|k}$ for a typical simulation of the system in equation (3.112) for uniform
measurement noise with variance equal to 10. Tables 3.10 and 3.11 compare the
performance of the non-Gaussian filters to the Kalman filter for uniformly
distributed measurement noise (Gaussian process noise and initial estimation error
with unit variance), and uniformly distributed process noise (Gaussian measurement
noise and initial estimation error with unit variance), respectively.

Table  3.10.  Comparison  of  Sample  Filter  Variances  for  4th  and  6th  Order  Filters 
Uniform  Measurement  Noise  Distribution 


$E[v_k^2]$  $\Gamma_{v_k}$
5           1.67
10          1.67
15          1.67

Filter 
Order  I  Kalman 


Sample  Filter  Variances 


Sym/T  Sym/Rao  Sym/Edge 


.868 


.825 


.984 


.912 


1.03 


0.963 


Figure 3.4. Symmetrical Filter Results, Model 1, D = 10, Uniform Measurement Noise
Table 3.11. Comparison of Sample Filter Variances for 4th and 6th Order Filters

Uniform  Process  Noise  Distribution 


eH\ 

r«* 

■  /  | 

Sample  Filter  Variances 

Kalman 

Sym/T 

Sym/Rao 

Sym/Edge 

5 

1.67 

4  th 

0.866 

0.838 

MEM 

B 

6th 

0.815 

0.815 

B 

10 

1.67 

IQI  | 

0.932 

0.910 

msm 

0.919 

6th 

0.876 

mem 

mm 

15 

1.67 

4th 

0.955 

0.937 

0.937 

0.945 

6 th 

0.902 

B 

0.903 

The non-Gaussian filters give better performance than the Kalman filter
for uniformly distributed noise. Again the relative performance is related to the
difference between the fourth moment $E[v_k^4]$ and the fourth moment of the Gaussian
distribution $3E[v_k^2]^2$.
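As a worked check of the value 1.67 listed for the uniform distribution, consider a zero-mean uniform density on $[-a, a]$ (the half-width $a$ is a symbol introduced here for illustration). The coefficient of excess of (3.117) is then independent of the variance:

```latex
\begin{align*}
E[v^2] &= \frac{1}{2a}\int_{-a}^{a} v^2\,dv = \frac{a^2}{3}, \qquad
E[v^4] = \frac{1}{2a}\int_{-a}^{a} v^4\,dv = \frac{a^4}{5}, \\
\Gamma_v &= \frac{3\,(E[v^2])^2}{E[v^4]}
          = \frac{3\,(a^2/3)^2}{a^4/5}
          = \frac{5}{3} \approx 1.67 .
\end{align*}
```

This is why the same value appears for every noise variance in Tables 3.10 and 3.11.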

3.8  Conclusion 

Two  approximate  methods  for  filtering  have  been  presented  for  estimation 
in  the  presence  of  asymmetric  and  symmetric  distributions  of  non-Gaussian  noise. 
Simulation  studies  have  shown  that  the  HOFs  can  perform  very  well  for  estimation 
in  non-Gaussian  noise.  The  real  utility  of  the  filters  developed  in  this  chapter  comes 
when  either  the  noise  cannot  be  adequately  represented  as  Gaussian  sums,  or  when 
only  the  moments  of  the  noise  are  known,  and  not  the  actual  density  functions. 
Although  these  filters  are  more  complicated  to  implement  than  the  standard  Kalman 
filter,  they  are  not  nearly  as  computationally  intensive  as  the  Gaussian  sum  filter 
in which the number of parallel filters grows geometrically as the number of stages
increases.

An obvious method to improve the performance of the non-Gaussian filters
developed here is to use higher powers of the innovations in developing the filter
equations. That is, let $l$ be greater than 3 in

\[ \hat{x}_{k|k} = \hat{x}_{k|k-1} + \sum_{i=0}^{l} K_k^{(i)} z_k^{i}. \]

However, this would make the vector derivation of the filter equations very unwieldy.
In addition, this derivation would require the availability of still higher order terms
in the solution of the filter variance equations. That is, for $l = 4$ the filter variance
equation would require up to 8th order prediction moments. The 4th order filter
moment would require up to 16th order prediction moments. In general, when
$l > 1$ it is necessary either to truncate the expressions for the filter moments so
that only those powers of prediction and measurement error moments are included
for which similar powers of the filter moments exist, or to approximate the higher
powers of the prediction and measurement error moments.
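The moment orders quoted above follow from a simple degree count, sketched here under the assumption that the innovation is linear in the prediction and measurement errors. The filter error is then a polynomial of degree $l$ in those errors, and its $n$-th power has degree $nl$:

```latex
\begin{align*}
\deg\!\left(\tilde{x}_{k|k}\right) = l
  \;\Longrightarrow\;
  E\!\left[\tilde{x}_{k|k}^{\,n}\right]
  \text{ involves moments up to order } n\,l, \\
l = 3:\quad n = 2 \to 6, \quad n = 4 \to 12; \qquad
l = 4:\quad n = 2 \to 8, \quad n = 4 \to 16 .
\end{align*}
```

The $l = 3$ case matches the 6th and 12th order moments that appear in the variance and 4th order moment equations of the symmetrical filter.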

For  non-Gaussian  distributions  made  up  of  known  Gaussian  sums,  the  non- 
Gaussian  filters  presented  here  give  a  reasonable  compromise  between  the  optimal 
but  very  computationally  intensive  Gaussian  sum  filter,  and  the  suboptimal  but 
easily  implemented  standard  Kalman  filter.  In  addition,  when  only  the  moments 
of  the  distributions  are  known  and  a  Gaussian  sum  filter  cannot  be  used,  the  non- 
Gaussian  filters  offer  a  means  to  obtain  improved  performance  over  the  standard 
Kalman  filter. 
Chapter 4

Nonlinear  Filtering  Methods  for  Harmonic  Retrieval 

This  chapter  addresses  the  problem  of  high  resolution  parameter  estimation 
of superimposed sinusoids using nonlinear filtering techniques. Six separate nonlinear
filters are evaluated for the estimation of the parameters of sinusoids in white and
colored  Gaussian  noise.  Experimental  evaluation  demonstrates  that  the  nonlinear 
filters  perform  very  satisfactorily  (close  to  the  Cramer-Rao  bound)  for  reasonable 
values  of  the  initial  estimation  error.  A  major  advantage  of  using  nonlinear  filtering 
methods  for  harmonic  retrieval  is  that  the  filters  can  be  applied  to  time  varying 
process  models  as  well. 

Some of the more recent work done in parametric methods for harmonic
superresolution includes modified singular value decomposition techniques [31], and
cumulant based techniques [32]. Generally these approaches perform well at high
SNRs, with close correspondence to the CR bound. However, the performance is
severely degraded at low SNRs.

Solution  of  the  harmonic  retrieval  problem  is  approached  using  3  nonlinear 
filters  including  the  extended  Kalman  filter,  the  Gaussian  second  order  filter,  and 
the  minimum  variance  filter.  Three  iterated  forms  of  the  extended  Kalman  filter 
are  also  applied  to  this  problem.  The  main  advantage  of  using  recursive  filtering 
techniques  over  more  traditional  batch-type  estimators  is  that  time  varying  system 
parameters  can  be  modeled.  The  nonlinear  filters  are  applied  to  data  consisting  of 
two  exponentially  damped  sinusoids  in  white  noise.  The  results  are  compared  to  the 
Cramer-Rao (CR) bound and to results obtained by other authors using singular
value decomposition (SVD) techniques. The performance of the nonlinear estimators
is also evaluated in colored noise with known and unknown noise filter coefficients.
In  addition,  a  technique  is  presented  to  perform  estimation  when  the  noise  statistics 
are  unknown.  In  this  case  the  noise  statistics  are  estimated  along  with  the  state 
estimates.  Overall  it  is  found  that  the  nonlinear  filters  give  performance  very  close 
to  the  CR  bound  whenever  the  initial  state  covariance  is  small.  The  techniques  are 
found  to  be  very  effective  in  colored  noise  with  known  and  unknown  coefficients, 
and  when  the  noise  statistics  are  unknown. 

The problem of high resolution frequency estimation has received a considerable
amount of attention in recent years. The classical method for frequency
estimation is based on nonparametric periodogram estimates and their variations
[31]. The frequency estimates are formed from the power spectral density estimates
obtained from the Fourier transform of the data or from the Fourier transform of the
autocorrelation sequence. The frequency resolution of these techniques is directly
related to the number of data samples in the received time series, so the periodogram
method is generally not considered a high resolution frequency estimator. In a related
technique developed by Capon [32], called maximum likelihood (ML) spectral
estimation, the PSD is estimated by effectively measuring the power out of a set
of narrowband filters. Foias et al. [33] have demonstrated that the ML estimate
converges monotonically to the point power spectrum associated with the sinusoids
as the number of correlation lags approaches infinity. They further show how this
convergence property can be used to determine whether a strong spectral peak
corresponds to a sinusoid. A complete review of all of the basic spectrum estimation
techniques is presented by Kay and Marple [34].

Parametric methods attempt to fit some assumed model to the data. Using
parametric methods, the problem becomes one of choosing a proper model and
estimating the parameters of the assumed model. Most of these models can be classified
into autoregressive (AR) or autoregressive moving average (ARMA) models. Some
of the early work on these models was done by Ulrych and Clayton [35] for the AR
model, and Cadzow [36] for the ARMA model. Ulrych and Clayton use a modified
covariance technique in which the sum of the squared forward and backward residuals
is minimized. Lang and McClellan [37] have shown that the variance of the
spectral estimate obtained using this approach is smaller than that of the covariance
method. Citron et al. [38] compare Ulrych and Clayton's method with Cadzow's
method and find that Ulrych and Clayton's technique appears to perform better.
Tufts and Kumaresan [39 - 43] and Hua [44] have enhanced the performance of AR
models by using singular value decomposition (SVD) and a reduced rank approximation.
They show that this technique has close correspondence to the CR bound
at high SNRs. However, Bresler and Macovski [45] point out that the performance
of most modern high resolution spectral estimation methods is severely degraded at
low SNRs and/or short data lengths. They postulate that this is due to the fact that
these techniques are heuristic least squares modifications of algorithms that yield
exact results when there is no noise or when the available data is infinite (known
covariance case). Other methods include adaptive notch filtering [46], adaptive line
enhancement [47], and pencil of function methods [48 - 50].

A relatively new method used for harmonic retrieval involves the use of higher
order statistics [51,52]. For non-Gaussian signal components and Gaussian noise the
third order moment and fourth order cumulant of the measurements will theoretically
only contain signal components and accurate estimates for the signal frequency
components can be obtained. Papadopoulos and Nikias [52] show that by using these
methods they can match the performance of the Kumaresan and Tufts methods in
white Gaussian noise and perform better in colored Gaussian noise. However, the
high order statistics methods show severely degraded performance at low SNRs.
Arun and Aung [57] have proposed an SVD based approach for tracking the
parameters of sinusoids with time varying parameters.

Other  algorithms  have  been  shown  to  give  good  performance  at  low  SNR’s. 
Chan  et  al.  [53]  developed  recursive  expressions  for  the  estimation  of  m  sinusoids 
in  white  noise  using  2m  coefficients  of  an  ARMA  model.  Friedlander  [54]  developed 
a  recursive  algorithm  for  maximum  likelihood  ARMA  spectral  estimation.  The 
iterative  inverse  filtering  method  [55]  is  shown  to  produce  accurate  estimates  of 
unknown  frequencies  at  low  SNR’s  and  a  small  number  of  points. 

Stankovic et al. [56] use the extended Kalman filter for the estimation of the
frequencies  of  sinusoids  in  white  and  colored  noise.  They  use  an  ARMA  model  for 
the  signal  and  the  noise.  Thus,  they  need  to  estimate  2  variables  for  each  frequency. 
For  good  initial  parameter  estimates,  the  EKF  method  outperforms  the  maximum 
likelihood  method. 

In  this  chapter  nonlinear  filtering  techniques  are  employed  to  estimate  the 
parameters  of  exponentially  damped  sinusoids  in  colored  noise.  A  direct  model  is 
used.  This  model  requires  one  state  variable  for  each  parameter  to  be  estimated. 
That  is,  the  state  variables  are  the  frequencies,  phases,  damping  coefficients,  and 
amplitudes  of  the  sinusoids.  Using  this  model  time  varying  characteristics  of  the 
state can be explicitly evaluated. A comparison is made among the EKF, the iterated
filters, and the minimum variance filter for this problem. Various filters are
evaluated in order to determine the estimator that is least susceptible to the impact
of the initial state covariance on the performance of the algorithms. The results are
compared to the CR bound and to the SVD methods of Kumaresan and Tufts.

This  chapter  is  organized  as  follows:  Section  4.1  describes  the  general  system 
model  used  for  all  of  the  filters  for  estimating  time  varying  amplitudes,  frequencies, 
damping  coefficients,  and  phases  for  a  known  number  of  sinusoids  in  additive  white 
Gaussian  noise.  Section  4.2  presents  the  nonlinear  filtering  equations  for  the  general 
system  model  and  the  details  of  the  implementation  for  the  specific  model  involving 
the  two  exponentially  damped  sinusoids  in  white  noise.  This  section  discusses  the 
extended  Kalman  filter,  the  Gaussian  second  order  filter,  the  minimum  variance 
filter,  and  three  iterated  forms  of  the  extended  Kalman  filter.  Experimental  results 
obtained from Monte Carlo simulations demonstrate the performance of these filters.
Section 4.3 presents the extended Kalman filter expressions and simulation
results  for  harmonic  retrieval  in  colored  noise  with  known  and  unknown  noise  filter 
coefficients.  A  technique  for  estimation  of  the  measurement  covariance  is  described 
and  experimentally  analyzed  in  section  4.4. 

4.1  General  System  Model 

Consider the problem of estimating the parameters of P sinusoids from K
measurements. The complex scalar measurement model is given by

\[ z_k = \sum_{p=1}^{P} c_{kp} \exp\!\big(-\alpha_{kp} k + j(\omega_{kp} k + \theta_{kp})\big) + v_k \qquad (4.1) \]

for $k = 0, 1, \ldots, K-1$. $v_k$ is assumed to be complex white Gaussian noise
with mutually independent real and imaginary components each with variance $\sigma^2$.
It is assumed that the frequencies $\omega_{kp}$ are normalized so that the effective sampling
interval  is  one  second.  The  2-element  measurement  model  zk  from  (4.1)  can  be 
where $w_k \sim N(0, Q_k)$.

It is convenient and straightforward to treat the amplitudes ($c_{kp}$), damping
coefficients ($\alpha_{kp}$), frequencies ($\omega_{kp}$), and phases ($\theta_{kp}$) as state variables. This permits
the time dependence of the signal parameters to be directly modeled through the
process equation (4.5). An alternative approach is to model the differential equation
for the measurements through the process equation. For the case of continuous time
real sinusoids let

\[ y(t) = c(t)\exp(-\alpha(t)t)\sin(\omega(t)t + \theta(t)) \]

then, holding the parameters fixed when differentiating,

\[ \dot{y}(t) = -\alpha(t)c(t)\exp(-\alpha(t)t)\sin(\omega(t)t + \theta(t)) + \omega(t)c(t)\exp(-\alpha(t)t)\cos(\omega(t)t + \theta(t)) \]

\[ \ddot{y}(t) = (\alpha^2(t) - \omega^2(t))\,c(t)\exp(-\alpha(t)t)\sin(\omega(t)t + \theta(t)) - 2\alpha(t)\omega(t)\,c(t)\exp(-\alpha(t)t)\cos(\omega(t)t + \theta(t)) \]

and the process equation can be expressed as

\[ \begin{bmatrix} \dot{y}(t) \\ \ddot{y}(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -(\alpha^2(t) + \omega^2(t)) & -2\alpha(t) \end{bmatrix} \begin{bmatrix} y(t) \\ \dot{y}(t) \end{bmatrix} + q(t) \]

where $q(t)$ is the process noise. The advantage of this formulation is that the
measurement equation $z(t) = y(t) + v(t)$, where $v(t)$ is the measurement noise, is a linear
function of the state. However, there are several disadvantages. The primary
disadvantage is determining the initial conditions. Since the process equation may contain
unknown parameters $c(t)$, $\alpha(t)$, $\omega(t)$, and $\theta(t)$, a reasonably small error in the initial
estimates of these parameters may lead to very poor initial estimates for $y(0)$ and
$\dot{y}(0)$, causing the solution to converge to harmonics of the actual frequency or
otherwise degrading filter performance. In addition, to express time dependence and initial
uncertainty of the unknown parameters, these parameters must also be modeled as state
variables, thus further complicating the process equation. Even in the case where
the unknown parameters are constant, the process equation is nonlinear. With these
considerations it was decided that the models given by (4.5) and (4.2) were more
appropriate and convenient for the harmonic retrieval problem.
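The second-order process equation for the continuous-time damped sinusoid can be checked numerically. The sketch below assumes constant parameters and uses illustrative values that are not taken from the text; the function names are hypothetical.

```python
import math

def damped_sinusoid(t, c, alpha, omega, theta):
    """Return y(t) and its first two derivatives for constant parameters."""
    e = c * math.exp(-alpha * t)
    s = math.sin(omega * t + theta)
    co = math.cos(omega * t + theta)
    y = e * s
    yd = e * (-alpha * s + omega * co)
    ydd = e * ((alpha**2 - omega**2) * s - 2.0 * alpha * omega * co)
    return y, yd, ydd

def state_derivative(y, yd, alpha, omega):
    """Noise-free right-hand side of the process equation for [y, ydot]."""
    return yd, -(alpha**2 + omega**2) * y - 2.0 * alpha * yd

# Illustrative values only (assumptions, not from the text).
c, alpha, omega, theta = 1.0, 0.2, 2.0 * math.pi * 0.42, 0.3
for t in (0.0, 0.5, 3.0):
    y, yd, ydd = damped_sinusoid(t, c, alpha, omega, theta)
    d1, d2 = state_derivative(y, yd, alpha, omega)
    # The state-space form reproduces the analytic derivatives.
    assert abs(d1 - yd) < 1e-12 and abs(d2 - ydd) < 1e-9
```

The check confirms that the state matrix is linear in $[y, \dot{y}]$ but depends on the unknown $\alpha$ and $\omega$, which is the source of the nonlinearity noted above once those parameters become states.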
4.1.1 Estimating the Parameters of 2 Sinusoids

A  particular  measurement  model  of  interest  consists  of  two  exponentially 
damped  sinusoids  in  white  noise.  This  model  has  been  analyzed  by  Kumaresan  and 
Tufts  [41]  using  reduced  rank  SVD  techniques,  and  by  Papadopoulos  and  Nikias  [52] 
using  cumulants. 


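For this model the measurement equation (4.2) becomes

$$
z_k = h_k(x_k) + v_k
= \begin{bmatrix}
\sum_{p=1}^{2} \exp(-\alpha_{kp}k)\sin(\omega_{kp}k) \\
\sum_{p=1}^{2} \exp(-\alpha_{kp}k)\cos(\omega_{kp}k)
\end{bmatrix} + v_k.
\tag{4.6}
$$

The associated state variables are defined as

$$
\begin{aligned}
x_{k_1} &= \omega_1 \\
x_{k_2} &= \alpha_1 \\
x_{k_3} &= \omega_2 \\
x_{k_4} &= \alpha_2
\end{aligned}
\tag{4.7}
$$

Assuming constant frequencies and damping coefficients, the plant equation that
describes the evolution of the states (4.7) is given by

$$x_{k+1} = x_k. \tag{4.8}$$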
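As an illustration of the two-sinusoid model, the following sketch evaluates the noise-free measurement (4.6) for the state ordering of (4.7) and generates noisy measurements. The parameter values are those quoted later in Section 4.2.5; the function names and the random seed are assumptions made for the example.

```python
import math
import random

def h(x, k):
    """Noise-free measurement (4.6) for x = [w1, a1, w2, a2]; returns [Im, Re]."""
    w1, a1, w2, a2 = x
    e1, e2 = math.exp(-a1 * k), math.exp(-a2 * k)
    return [e1 * math.sin(w1 * k) + e2 * math.sin(w2 * k),
            e1 * math.cos(w1 * k) + e2 * math.cos(w2 * k)]

def simulate(x, K, sigma2, seed=0):
    """Generate K two-element measurements with white noise of variance sigma2."""
    rng = random.Random(seed)
    s = math.sqrt(sigma2)
    return [[hi + rng.gauss(0.0, s) for hi in h(x, k)] for k in range(K)]

# Constant state (4.8): frequencies 0.42 and 0.52 cycles/sample, damping 0.2 and 0.1.
x_true = [0.42 * 2 * math.pi, 0.2, 0.52 * 2 * math.pi, 0.1]
z = simulate(x_true, K=25, sigma2=0.05)
```

Because the plant equation is $x_{k+1} = x_k$, the same state vector is used at every sample and only the time index `k` changes.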


4.2  Nonlinear  Filters  for  Harmonic  Retrieval 

This section presents the equations used for the extended Kalman filter
(EKF), the Gaussian second order (GSO) filter, the minimum variance filter (MVF),
and three iterated forms of the extended Kalman filter for the harmonic retrieval
problem. The EKF and GSO equations are approximate solutions to the nonlinear
estimation problem. The EKF requires the first order Taylor series expansion of the
measurement about the latest estimate, and the GSO filter requires second order
expansions. The MVF equations are exact expressions of the mean squared error
state estimates for exponential nonlinearities in Gaussian noise. The three iterated
filters are extensions of the EKF equations. The process model presented in (4.5)
is appropriate for the general case of time varying parameters. However, in this
chapter the state variables are restricted to be time invariant.

4.2.1  Extended  Kalman  Filter 

The  EKF  is  obtained  by  making  Gaussian  assumptions  about  the  a  posteriori 
densities  and  by  extending  the  plant  and  measurement  nonlinearities  in  a  Taylor 
series  including  first  order  terms.  The  extended  Kalman  filter  equations  for  time 
invariant  states  [6]  (p.  195)  are  given  by 

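$$
\begin{aligned}
K_k &= P_{k-1|k-1} H_k^T \left(H_k P_{k-1|k-1} H_k^T + R_k\right)^{-1} \\
P_{k|k} &= (I_n - K_k H_k) P_{k-1|k-1} \\
\hat{x}_{k|k} &= \hat{x}_{k-1|k-1} + K_k \tilde{z}_k \\
\tilde{z}_k &= z_k - h_k(\hat{x}_{k|k-1})
\end{aligned}
\tag{4.9}
$$

where $\hat{x}_{k|k-1}$ is the predicted estimate, $P_{k|k-1}$ is the one-step prediction covariance,
$K_k$ is the filter gain, $\hat{x}_{k|k}$ is the filtered estimate, and $P_{k|k}$ is the filter covariance.
For time invariant states the predicted estimate equals the previous filtered estimate,
$\hat{x}_{k|k-1} = \hat{x}_{k-1|k-1}$. $I_n$ is the n-dimensional identity matrix. The filter requires the
initial conditions $E[x_0] = \hat{x}_0$ and $E[(x_0 - \hat{x}_0)(x_0 - \hat{x}_0)^T] = P_0$.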
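For the measurement equation (4.6),

$$
H_k = \left.\frac{\partial h_k(x_k)}{\partial x_k}\right|_{x_k = \hat{x}_{k|k-1}}
= k \begin{bmatrix}
e^{-x_{k_2}k}\cos(x_{k_1}k) & -e^{-x_{k_2}k}\sin(x_{k_1}k) & e^{-x_{k_4}k}\cos(x_{k_3}k) & -e^{-x_{k_4}k}\sin(x_{k_3}k) \\
-e^{-x_{k_2}k}\sin(x_{k_1}k) & -e^{-x_{k_2}k}\cos(x_{k_1}k) & -e^{-x_{k_4}k}\sin(x_{k_3}k) & -e^{-x_{k_4}k}\cos(x_{k_3}k)
\end{bmatrix}_{x_k = \hat{x}_{k-1|k-1}}
\tag{4.10}
$$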
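A minimal sketch of one EKF measurement update for the four-state model is given below, assuming NumPy is available. The Jacobian is the analytic derivative of the measurement function (4.6) with respect to the states (4.7); the function names and the numerical values in the usage lines are illustrative assumptions.

```python
import numpy as np

def h(x, k):
    """Measurement function (4.6) for x = [w1, a1, w2, a2]."""
    w1, a1, w2, a2 = x
    e1, e2 = np.exp(-a1 * k), np.exp(-a2 * k)
    return np.array([e1 * np.sin(w1 * k) + e2 * np.sin(w2 * k),
                     e1 * np.cos(w1 * k) + e2 * np.cos(w2 * k)])

def jacobian(x, k):
    """Analytic 2x4 Jacobian of h with respect to the state."""
    w1, a1, w2, a2 = x
    e1, e2 = np.exp(-a1 * k), np.exp(-a2 * k)
    s1, c1 = np.sin(w1 * k), np.cos(w1 * k)
    s2, c2 = np.sin(w2 * k), np.cos(w2 * k)
    return k * np.array([[e1 * c1, -e1 * s1, e2 * c2, -e2 * s2],
                         [-e1 * s1, -e1 * c1, -e2 * s2, -e2 * c2]])

def ekf_update(x, P, z, k, R):
    """One EKF update for time invariant states, linearized about x."""
    H = jacobian(x, k)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x_new = x + K @ (z - h(x, k))
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# Illustrative usage: one update at k = 3 from a perturbed initial estimate.
x_true = np.array([0.42 * 2 * np.pi, 0.2, 0.52 * 2 * np.pi, 0.1])
x0 = x_true + 0.05
P0 = 0.04 * np.eye(4)
R = 0.05 * np.eye(2)
x1, P1 = ekf_update(x0, P0, h(x_true, 3), 3, R)
```

Note that at $k = 0$ the Jacobian is zero, so the first sample carries no information about the frequencies or damping coefficients and leaves the estimate unchanged.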


4.2.2  Gaussian  Second  Order  Filter 

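The Gaussian second order (GSO) filter relations [6] are obtained by including
second order terms in the Taylor series expansions of the plant and measurement
equations. The constant state model of (4.4) leads to the GSO filter equations

$$
\begin{aligned}
K_k &= P_{k-1|k-1} H_k^T \left(H_k P_{k-1|k-1} H_k^T + R_k + B_k\right)^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k-1|k-1} + K_k \tilde{z}_k \\
P_{k|k} &= (I_n - K_k H_k) P_{k-1|k-1}
\end{aligned}
\tag{4.11}
$$

where

$$
\tilde{z}_{k_i} = z_{k_i} - h_{k_i}(\hat{x}_{k|k-1}) - \frac{1}{2}\,\partial^2(h_{k_i}, P_{k-1|k-1})\Big|_{x_k = \hat{x}_{k|k-1}}
\tag{4.12}
$$

and

$$
\partial^2(h_{k_i}, P_{k-1|k-1}) = \operatorname{trace}\left\{\left[\frac{\partial^2 h_{k_i}}{\partial x_p \partial x_q}\right] P_{k-1|k-1}\right\}.
\tag{4.13}
$$

The bracketed quantity in (4.13) is a matrix whose $pq$th element is the quantity
$\partial^2 h_{k_i} / \partial x_p \partial x_q$. The matrix $B_k$ is approximated by

$$
B_{k_{ij}} \approx \frac{1}{4} \sum_{p,q,m,n} \frac{\partial^2 h_{k_i}}{\partial x_p \partial x_q}\,(d_{pm}d_{qn} + d_{pn}d_{qm})\,\frac{\partial^2 h_{k_j}}{\partial x_m \partial x_n}\Bigg|_{x_k = \hat{x}_{k|k-1}}
\tag{4.14}
$$

where $h_{k_i}$ denotes the $i$th element of $h_k(\cdot)$, and the $d$'s are elements of $P_{k-1|k-1}$.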


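For the model (4.8)

$$
\frac{\partial^2 h_{k_i}(x_k)}{\partial x_k \partial x_k^T}
= k^2 \begin{bmatrix}
e^{-x_{k_2}k} D_{k_i} & 0 \\
0 & e^{-x_{k_4}k} E_{k_i}
\end{bmatrix}
\tag{4.15}
$$

where

$$
D_{k_1} = \begin{bmatrix} -\sin(x_{k_1}k) & -\cos(x_{k_1}k) \\ -\cos(x_{k_1}k) & \sin(x_{k_1}k) \end{bmatrix},
\qquad
D_{k_2} = \begin{bmatrix} -\cos(x_{k_1}k) & \sin(x_{k_1}k) \\ \sin(x_{k_1}k) & \cos(x_{k_1}k) \end{bmatrix}
\tag{4.16}
$$

and $E_{k_1}$, $E_{k_2}$ have the same form as $D_{k_1}$, $D_{k_2}$ with $x_{k_1}$ replaced by $x_{k_3}$.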


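$$
\begin{aligned}
\hat{x}_{k|k-1} &= f_{k-1}(\hat{x}_{k-1|k-1}) + \frac{1}{2}\,\partial^2(f_{k-1}, P_{k-1|k-1}) \\
P_{k|k-1} &= F_{k-1} P_{k-1|k-1} F_{k-1}^T + \Gamma_{k-1} Q_{k-1} \Gamma_{k-1}^T + A_{k-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{z}_k \\
P_{k|k} &= (I_n - K_k H_k) P_{k|k-1} \\
K_k &= P_{k|k-1} H_k^T \left(H_k P_{k|k-1} H_k^T + R_k + B_k\right)^{-1}.
\end{aligned}
\tag{4.17}
$$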
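The second order correction terms of the GSO filter can be sketched for the four-state model as follows, assuming NumPy. The code builds the Hessians of (4.15) and (4.16), the bias term $\frac{1}{2}\,\mathrm{trace}\{\cdot\}$ of (4.12) and (4.13), and the matrix $B_k$ of (4.14) by direct summation. Function names and the usage values are illustrative assumptions.

```python
import numpy as np

def hessians(x, k):
    """Hessians of the two measurement components for x = [w1, a1, w2, a2]."""
    w1, a1, w2, a2 = x

    def blocks(w):
        s, c = np.sin(w * k), np.cos(w * k)
        # Block for the sine (Im) row and block for the cosine (Re) row.
        return np.array([[-s, -c], [-c, s]]), np.array([[-c, s], [s, c]])

    D1, D2 = blocks(w1)
    E1, E2 = blocks(w2)
    out = []
    for D, E in ((D1, E1), (D2, E2)):
        Hs = np.zeros((4, 4))
        Hs[:2, :2] = np.exp(-a1 * k) * D
        Hs[2:, 2:] = np.exp(-a2 * k) * E
        out.append(k**2 * Hs)
    return out

def gso_terms(x, P, k):
    """Return the innovation bias (one entry per measurement row) and B_k."""
    Hs = hessians(x, k)
    bias = 0.5 * np.array([np.trace(Hi @ P) for Hi in Hs])
    B = np.empty((2, 2))
    for i, Hi in enumerate(Hs):
        for j, Hj in enumerate(Hs):
            B[i, j] = 0.25 * (np.einsum("pq,pm,qn,mn->", Hi, P, P, Hj)
                              + np.einsum("pq,pn,qm,mn->", Hi, P, P, Hj))
    return bias, B

# Illustrative usage.
x = np.array([0.42 * 2 * np.pi, 0.2, 0.52 * 2 * np.pi, 0.1])
P = 0.04 * np.eye(4)
bias, B = gso_terms(x, P, 3)
```

Since both the Hessians and the covariance are symmetric, the double sum in (4.14) reduces to $\frac{1}{2}\,\mathrm{trace}(\mathcal{H}_i P \mathcal{H}_j P)$, where $\mathcal{H}_i$ denotes the Hessian of $h_{k_i}$; this gives a cheaper way to form $B_k$ in practice.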

4.2.3  Minimum  Variance  Filter 

The EKF and the GSO filters are based on a Taylor series expansion of
the nonlinear equations about the most recent estimate. As such, the EKF and
GSO filters are subject to the inherent problems of local linearization, which may
lead to poor performance. Liang [23] developed a minimum variance filter (MVF)
which gives exact estimates at each iteration of the filter based on the assumption
that the estimation errors are Gaussian. He has shown that for certain nonlinear
functions such as polynomial nonlinearities, exponential functions, and sinusoids,
exact expressions for the state estimates can be obtained and used in the filter
relations in place of the usual approximations. At each step in the operation the
prediction and filter errors are assumed to be Gaussian. Liang compared this
filter to the EKF and other filters using numerical examples and reports that it
performs much better than the EKF for large initial error variances. Using
the plant and measurement models in (4.5) and (4.6) the minimum variance filter
relations are given by

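$$
\begin{aligned}
K_k &= E[\tilde{x}_{k|k-1}\tilde{h}_k(x_k)^T]\left(R_k + E[\tilde{h}_k(x_k)\tilde{h}_k(x_k)^T]\right)^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k-1|k-1} + K_k\tilde{z}_k, \qquad \tilde{z}_k = z_k - \hat{h}_k(x_k) \\
P_{k|k} &= P_{k-1|k-1} - K_k E[\tilde{h}_k(x_k)\tilde{x}_{k|k-1}^T]
\end{aligned}
\tag{4.18}
$$

where

$$
\begin{aligned}
\tilde{x}_{k|k-1} &= x_k - \hat{x}_{k|k-1} \\
\tilde{h}_k(x_k) &= h_k(x_k) - \hat{h}_k(x_k) \\
\tilde{f}_{k-1}(x_{k-1}) &= f_{k-1}(x_{k-1}) - \hat{f}_{k-1}(x_{k-1}).
\end{aligned}
\tag{4.19}
$$

The MVF requires exact analytical expressions for $E[h_k(x_k)]$, $E[x_k h_k(x_k)^T]$, and
$E[h_k(x_k)h_k(x_k)^T]$.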

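The general system model (4.2) requires closed form relations for functions
of the form $E[\exp(y_k)]$, $E[x_{k_i}\exp(y_k)]$, and $E[x_{k_i}x_{k_j}\exp(y_k)]$, where $y_k$ is defined
by the inner product

$$y_k = u_k^T x_k$$

and where $u_k$ is a vector of deterministic coefficients. Note that if $x_k$ is a vector
of jointly Gaussian variables, then $y_k$ is also Gaussian. Liang [24] derives relations
for expectations of this form. The following relations can be used to evaluate these
expectations for the general system model given in (4.2):

$$
\begin{aligned}
E[\exp(y_k)] &= \exp\left(u_k^T\hat{x}_{k|k-1} + \tfrac{1}{2}u_k^T P_{k|k-1}u_k\right) \\
E[x_{k_i}\exp(y_k)] &= \left(\hat{x}_{k_i|k-1} + u_k^T P_{k|k-1}e_i\right)
\exp\left(u_k^T\hat{x}_{k|k-1} + \tfrac{1}{2}u_k^T P_{k|k-1}u_k\right) \\
E[x_{k_i}x_{k_j}\exp(y_k)] &= \left(e_i^T P_{k|k-1}e_j + \hat{x}_{k_i|k-1}\hat{x}_{k_j|k-1}
+ \hat{x}_{k_i|k-1}u_k^T P_{k|k-1}e_j + \hat{x}_{k_j|k-1}u_k^T P_{k|k-1}e_i \right. \\
&\qquad \left. +\; u_k^T P_{k|k-1}e_i\,u_k^T P_{k|k-1}e_j\right)
\exp\left(u_k^T\hat{x}_{k|k-1} + \tfrac{1}{2}u_k^T P_{k|k-1}u_k\right)
\end{aligned}
\tag{4.20}
$$

where $\hat{x}_{k_i|k-1}$ is the $i$th element of $\hat{x}_{k|k-1}$ and $e_i$ is the $i$th unit vector. This vector
is zero except for the $i$th element, which is one.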
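The Gaussian exponential moments can be evaluated in vector form and checked against numerical integration. The sketch below assumes NumPy; `exp_moments` is a hypothetical helper name, the mean and covariance stand for $\hat{x}_{k|k-1}$ and $P_{k|k-1}$, and the scalar test values are illustrative. The coefficient vector may be complex, as required for sinusoidal measurements.

```python
import numpy as np

def exp_moments(u, m, P):
    """Closed-form moments for x ~ N(m, P) and y = u^T x (u may be complex).

    Returns E[exp(y)], the vector E[x exp(y)], and the matrix E[x x^T exp(y)].
    """
    Pu = P @ u
    e0 = np.exp(u @ m + 0.5 * (u @ Pu))
    e1 = (m + Pu) * e0
    e2 = (P + np.outer(m + Pu, m + Pu)) * e0
    return e0, e1, e2

# Scalar check by Gauss-Hermite quadrature (illustrative values).
m, P, u = np.array([0.3]), np.array([[0.04]]), np.array([-2.0 + 3.0j])
t, w = np.polynomial.hermite_e.hermegauss(60)   # weight exp(-t^2 / 2)
xs = m[0] + np.sqrt(P[0, 0]) * t
f = np.exp(u[0] * xs)
num0 = np.sum(w * f) / np.sqrt(2 * np.pi)
num1 = np.sum(w * xs * f) / np.sqrt(2 * np.pi)
num2 = np.sum(w * xs**2 * f) / np.sqrt(2 * np.pi)
e0, e1, e2 = exp_moments(u, m, P)
```

The matrix form `P + outer(m + Pu, m + Pu)` expands to the five terms of the third relation in (4.20).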


For the measurement equation in (4.6), in which the amplitudes of the
sinusoids $c_{kp}$ are known, only the first two expressions in (4.20) are necessary.

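4.2.3.1 Evaluation of $E[h_k]$

Let the measurement nonlinearity from (4.6) be expressed as

$$
h_k = \begin{bmatrix}
\operatorname{Im}\left(\exp(u_{k_1}^T x_k) + \exp(u_{k_2}^T x_k)\right) \\
\operatorname{Re}\left(\exp(u_{k_1}^T x_k) + \exp(u_{k_2}^T x_k)\right)
\end{bmatrix}
\tag{4.21}
$$

where

$$
\begin{aligned}
u_{k_1}^T &= [\, jk \quad -k \quad 0 \quad 0 \,] \\
u_{k_2}^T &= [\, 0 \quad 0 \quad jk \quad -k \,].
\end{aligned}
\tag{4.22}
$$

Define the quantities

$$
\begin{aligned}
\hat{h}_{k_1} &= \exp\left(u_{k_1}^T\hat{x}_{k|k-1} + \tfrac{1}{2}u_{k_1}^T P_{k|k-1}u_{k_1}\right) \\
\hat{h}_{k_2} &= \exp\left(u_{k_2}^T\hat{x}_{k|k-1} + \tfrac{1}{2}u_{k_2}^T P_{k|k-1}u_{k_2}\right).
\end{aligned}
\tag{4.23}
$$

Using (4.23) and (4.20) the expression for the expected value of the measurements
for the model in (4.6) is obtained as

$$
E[h_k] = \begin{bmatrix}
\operatorname{Im}(\hat{h}_{k_1} + \hat{h}_{k_2}) \\
\operatorname{Re}(\hat{h}_{k_1} + \hat{h}_{k_2})
\end{bmatrix}.
\tag{4.24}
$$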


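4.2.3.2 Evaluation of $E[x_k h_k^T]$

For the model in (4.6), $E[x_k h_k^T]$ can be found from (4.20) using

$$
\begin{aligned}
C_k &= E\left[x_k\left(\exp(u_{k_1}^T x_k) + \exp(u_{k_2}^T x_k)\right)\right] \\
&= (\hat{x}_{k|k-1} + P_{k|k-1}u_{k_1})\hat{h}_{k_1} + (\hat{x}_{k|k-1} + P_{k|k-1}u_{k_2})\hat{h}_{k_2}
\end{aligned}
\tag{4.25}
$$

which gives

$$E[x_k h_k^T] = [\,\operatorname{Im}(C_k) \;\; \operatorname{Re}(C_k)\,]. \tag{4.26}$$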


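4.2.3.3 Evaluation of $E[h_k h_k^T]$

$E[h_k h_k^T]$ for the model in (4.6) is evaluated using the first equation in (4.20).

$$
E[h_k h_k^T] = \begin{bmatrix} h_{k_{11}} & h_{k_{12}} \\ h_{k_{12}} & h_{k_{22}} \end{bmatrix}
\tag{4.27}
$$

where

$$
\begin{aligned}
h_{k_{11}} &= \operatorname{Re}\{[\,1 \;\; -1 \;\; 2 \;\; -2 \;\; 1 \;\; -1\,]\,E_k/2\} \\
h_{k_{12}} &= \operatorname{Im}\{[\,0 \;\; 1 \;\; 0 \;\; 2 \;\; 0 \;\; 1\,]\,E_k/2\} \\
h_{k_{22}} &= \operatorname{Re}\{[\,1 \;\; 1 \;\; 2 \;\; 2 \;\; 1 \;\; 1\,]\,E_k/2\}.
\end{aligned}
\tag{4.28}
$$

The $i$th element of the column vector $E_k$ is defined as

$$
\begin{aligned}
E_{k_i} &= E[\exp(U_{k_i}x_k)] \\
&= \exp\left(U_{k_i}\hat{x}_{k|k-1} + U_{k_i}P_{k|k-1}U_{k_i}^T/2\right)
\end{aligned}
\tag{4.29}
$$

and where $U_{k_i}$ is the $i$th row of the matrix $U_k$, which is given by

$$
U_k = k \begin{bmatrix}
0 & -2 & 0 & 0 \\
2j & -2 & 0 & 0 \\
j & -1 & -j & -1 \\
j & -1 & j & -1 \\
0 & 0 & 0 & -2 \\
0 & 0 & 2j & -2
\end{bmatrix}.
\tag{4.30}
$$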


The quantities $E[h_k]$, $E[x_k h_k^T]$, and $E[h_k h_k^T]$, found from (4.24), (4.26),
and (4.27), respectively, are used in (4.18) to form the minimum variance filter
expressions.
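The three expectations needed by the MVF for the two-sinusoid model can be sketched as below, assuming NumPy. The helper names are hypothetical; `m` and `P` stand for $\hat{x}_{k|k-1}$ and $P_{k|k-1}$, and the weight vectors and the matrix `U` follow (4.28) and (4.30).

```python
import numpy as np

def gauss_exp(U, m, P):
    """E[exp(U_i x)] for every row U_i of U, with x ~ N(m, P), as in (4.29)."""
    return np.exp(U @ m + 0.5 * np.einsum("ip,pq,iq->i", U, P, U))

def mvf_expectations(m, P, k):
    """Return E[h], E[x h^T] (4x2) and E[h h^T] (2x2) for the model (4.6)."""
    u = k * np.array([[1j, -1, 0, 0], [0, 0, 1j, -1]])          # (4.22)
    hhat = gauss_exp(u, m, P)                                    # (4.23)
    s = hhat.sum()
    Eh = np.array([s.imag, s.real])                              # (4.24)
    C = (m + P @ u[0]) * hhat[0] + (m + P @ u[1]) * hhat[1]      # (4.25)
    Exh = np.column_stack([C.imag, C.real])                      # (4.26)
    U = k * np.array([[0, -2, 0, 0], [2j, -2, 0, 0], [1j, -1, -1j, -1],
                      [1j, -1, 1j, -1], [0, 0, 0, -2], [0, 0, 2j, -2]])
    E = gauss_exp(U, m, P)
    h11 = (np.array([1, -1, 2, -2, 1, -1]) @ E).real / 2
    h12 = (np.array([0, 1, 0, 2, 0, 1]) @ E).imag / 2
    h22 = (np.array([1, 1, 2, 2, 1, 1]) @ E).real / 2
    return Eh, Exh, np.array([[h11, h12], [h12, h22]])

# Illustrative usage at k = 3.
m = np.array([0.42 * 2 * np.pi, 0.2, 0.52 * 2 * np.pi, 0.1])
Eh, Exh, Ehh = mvf_expectations(m, 0.01 * np.eye(4), 3)
```

A useful sanity check is the zero-covariance limit: with $P_{k|k-1} = 0$ the expectations collapse to $h_k(\hat{x})$, $\hat{x}\,h_k(\hat{x})^T$, and $h_k(\hat{x})h_k(\hat{x})^T$.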

4.2.4  Iterated  Filters 

Three iterated forms of the extended Kalman filter are also used for parameter
estimation. The iterated filters can be categorized into two classes. Locally
iterated filters are implemented by continuously processing the data for a given
measurement (i.e. for a given value of $k$) until the error between iterations is
minimized or until a maximum iteration count is exceeded. Globally iterated filters are
implemented by processing the entire data set more than once, essentially recycling
the data set through the filter.

4.2.4.1 Locally Iterated Extended Kalman Filter

The locally iterated Kalman filter (LIKF) is an enhanced version of the
extended Kalman filter where, at each step of the iteration procedure, the measurement
nonlinearity is linearized about the state estimate obtained from the EKF equations.
This filter was first introduced by Denham and Pines [21]. The procedure is to
repetitively calculate $\hat{x}_{k|k}$, $K_k$, and $P_{k|k}$, each time by linearizing about the most recent
estimate. The recursion relations for the LIKF are given by [6] (p. 190)

$$
\begin{aligned}
\hat{x}_{k|k}(i+1) &= \hat{x}_{k|k-1} + K_k(i)\left[z_k - h_k(\hat{x}_{k|k}(i)) - H_k(i)\left(\hat{x}_{k|k-1} - \hat{x}_{k|k}(i)\right)\right] \\
P_{k|k}(i) &= (I_n - K_k(i)H_k(i))P_{k|k-1} \\
K_k(i) &= P_{k|k-1}H_k(i)^T\left(H_k(i)P_{k|k-1}H_k(i)^T + R_k\right)^{-1}
\end{aligned}
\tag{4.31}
$$

where $i = 0, 1, \ldots$. The number of repetitions of the calculations shown above can
be determined by requiring the magnitude of the difference between successive state
estimates to be less than some small number.
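The local iteration can be sketched as a single function, assuming NumPy. The measurement function `h` and its Jacobian `jac` are passed in as callables; starting the iteration from the predicted estimate is an assumption of this sketch, and the default tolerance and iteration cap mirror the values reported for the experiments in Section 4.2.5.

```python
import numpy as np

def likf_update(x_pred, P_pred, z, h, jac, R, tol=1e-4, max_iter=9):
    """Locally iterated EKF update following the recursion (4.31).

    Returns the iterated estimate, its covariance, and the iteration count.
    """
    eta = x_pred.copy()                 # assumed start: the predicted estimate
    for i in range(max_iter):
        H = jac(eta)                    # relinearize about the latest estimate
        K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
        eta_next = x_pred + K @ (z - h(eta) - H @ (x_pred - eta))
        done = np.linalg.norm(eta_next - eta) < tol
        eta = eta_next
        if done:
            break
    P = (np.eye(len(x_pred)) - K @ H) @ P_pred
    return eta, P, i + 1

# Illustrative check with a linear measurement z = A x, where the iteration
# must reproduce the ordinary Kalman update and stop after two passes.
A = np.array([[1.0, 0.0], [0.0, 2.0]])
x_hat, P_hat, n_iter = likf_update(np.zeros(2), np.eye(2), np.array([1.0, 2.0]),
                                   lambda x: A @ x, lambda x: A, np.eye(2))
```

For a linear measurement the second iteration returns the same estimate as the first, so the stopping rule triggers immediately; the benefit of relinearization appears only when `h` is nonlinear.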

4.2.4.2 Globally Iterated Extended Kalman Filter

Another form of the iterated Kalman filter, designated the globally iterated
extended Kalman filter (GIKF) [6], involves restarting the filter after each complete
pass through the data. After filtering the K measurements with the extended
Kalman filter, the covariance is reset back to its initial value and the filter is restarted
with the first measurement, but using the final estimate from the previous iteration
as the new initial estimate. This technique can be repeated until the difference in
the final estimates from successive iterations converges to some small value. By
resetting the covariance the system is essentially re-excited, thereby allowing the state
estimates to be perturbed. The premise for this technique is that good estimates
will not be made worse, but poor estimates may be forced to better values.
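The global iteration amounts to a loop around an ordinary EKF pass. The sketch below assumes NumPy and constant states; `ekf_pass` and `gikf` are hypothetical names, and the optional tolerance argument is an illustrative stopping rule. The usage lines apply it to a trivial scalar linear problem so that the effect of repeated passes is easy to follow.

```python
import numpy as np

def ekf_pass(x0, P0, zs, h, jac, R):
    """One EKF pass over all measurements for time invariant states."""
    x, P = x0.copy(), P0.copy()
    for k, z in enumerate(zs):
        H = jac(x, k)
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (z - h(x, k))
        P = (np.eye(len(x)) - K @ H) @ P
    return x, P

def gikf(x0, P0, zs, h, jac, R, passes=5, tol=None):
    """Globally iterated EKF: restart each pass from the last estimate with P0."""
    x = x0.copy()
    for _ in range(passes):
        x_new, P = ekf_pass(x, P0, zs, h, jac, R)   # covariance reset to P0
        converged = tol is not None and np.linalg.norm(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, P

# Illustrative scalar example: five noise-free measurements of a constant 1.0.
zs = [np.array([1.0])] * 5
h_lin = lambda x, k: x
jac_lin = lambda x, k: np.array([[1.0]])
x1, _ = ekf_pass(np.zeros(1), np.eye(1), zs, h_lin, jac_lin, np.eye(1))
x5, _ = gikf(np.zeros(1), np.eye(1), zs, h_lin, jac_lin, np.eye(1), passes=5)
```

In this toy case a single pass moves the estimate from 0 to 5/6, and each further pass shrinks the remaining error by the same factor, which illustrates how resetting the covariance re-excites the filter.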

4.2.4.3 Covariance Resetting

A similar procedure can be applied within a single pass of the data. For
example, whenever the state covariance converges rapidly to a relatively small steady
state value, the covariance is reset to a point between its present value and the
initial value. The effect of resetting is to re-excite the system after
steady state is reached, so that early saturation of the filter gain to a small value
that would prevent changes in the estimates is avoided. The disadvantage of this
technique is that the variance of the final estimates may not be as good as it would
have been if no resetting had taken place. The advantage is that poor estimates may be
forced to better values. This technique can also be applied over multiple passes of the same
data. During the final pass the covariance would not be reset within the single pass,
thus allowing the estimates to converge to the best possible values. This filtering
technique will be referred to as the extended Kalman filter with resetting (EKFR).

4.2.5  Experimental  Results  for  Estimation  in  White  Noise 

A well documented problem that has been traditionally approached using
AR-based techniques is the estimation of the damping coefficients and frequencies
of two sinusoids [41, 52]. Referring to equation (4.4), the damping coefficient values were
$\alpha_{k1} = 0.2$ and $\alpha_{k2} = 0.1$, with normalized frequencies of $\omega_{k1} = 0.42 \cdot 2\pi$ radians and
$\omega_{k2} = 0.52 \cdot 2\pi$ radians, for $0 \le k \le 24$.

The six nonlinear filters discussed previously are applied to this problem.
These filters will be designated as: EKF - extended Kalman filter, GSO - Gaussian
second order filter, MVF - minimum variance filter, LIKF - locally iterated extended
Kalman filter, GIKF - globally iterated extended Kalman filter, and EKFR -
extended Kalman filter with covariance resetting. In the implementation of the LIKF
using (4.31), the repetition cycle for a given sample was terminated whenever either
the magnitude of the difference between successive state estimates was less than
0.0001 or a total of 9 repetitions had been completed. The GIKF was
implemented by processing the entire set of measurements five times. The EKFR was
implemented by resetting the covariance every three measurements. The covariance
was reset according to the formula

Pk ;,*(-)  =  k  =  2,5,8,  . . .  (4.32) 

where initially $P_{ref} = P_0$. After $P_{k|k}(-)$ is computed the new reference becomes
$P_{ref} = P_{k|k}(-)$. Using the EKFR the covariance was reset during the first two
passes using (4.32). On the third and final pass the covariance was not reset. At
the beginning of each of the second and third passes the initial condition on the state
was set to the final state estimate from the previous pass.

The filter performance was evaluated as a function of signal-to-noise ratio
(SNR) for a range of 0 dB to 30 dB. SNR is defined as $10\log_{10}\frac{1}{2\sigma^2}$, where $\sigma^2$ is the
variance of each of the real and imaginary components of the i.i.d. complex noise.
Note that this definition gives the peak SNR. Performance was evaluated at each
SNR by forming the sample variance of the estimation error over 500 independent
noise runs. In each run the signal was kept the same while the noise was modified
using different random number seeds.
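The SNR definition and the Monte Carlo error statistic can be written compactly. The sketch below uses only the standard library; the function names are hypothetical, and the sample variance is taken about zero under the assumption that the quantity of interest is the spread of the estimation error around the true value.

```python
import math

def snr_db(sigma2):
    """Peak SNR in dB for unit-amplitude sinusoids and per-component variance sigma2."""
    return 10.0 * math.log10(1.0 / (2.0 * sigma2))

def sigma2_from_snr(snr):
    """Per-component noise variance that yields the requested SNR in dB."""
    return 0.5 * 10.0 ** (-snr / 10.0)

def error_sample_variance(estimates, truth):
    """Sample variance of the estimation error over independent runs."""
    errors = [e - truth for e in estimates]
    return sum(err * err for err in errors) / (len(errors) - 1)

# A per-component variance of 0.05 corresponds to 10 dB SNR.
level = snr_db(0.05)
```

The definition is a peak value because the damped sinusoids have unit amplitude only at $k = 0$; the instantaneous SNR falls as the signals decay.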

Figure 4.1 illustrates the estimation error as a function of sample number for
a representative run at 20 dB SNR. This figure compares the relative performance of
the non-iterated filters (the EKF, GSO, MVF) and the locally iterated LIKF. The
diagonal elements of the initial covariance were set to a value of 0.04 and the initial
errors were randomly chosen based on this value. Overall this figure shows that
the LIKF outperforms any of the noniterated filters. The MVF and the GSO give
about the same results, and the EKF gives the worst performance. In this example
the filters perform about the same in estimating the other state variables. Figure
4.2 shows the diagonal elements of the filter covariance $P_{k|k}$ for the same example.
This graph shows that the covariance elements for the LIKF converge more rapidly
than those for the other filters. It can be seen from this example that at 20 dB
SNR the filter converges to its final state estimate within about 10 samples. Figures
4.3 and 4.4 show the estimation error and sample covariances, respectively, at 0 dB
SNR. These figures again demonstrate that the LIKF generally outperforms the other
estimators, and that the EKF performs the worst. At 0 dB SNR the estimates
stabilize in about 12 to 15 samples for $P_0 = 0.04$. This illustrates the suitability of
using these nonlinear filtering techniques for short data lengths.

Figure 4.5 presents the performance results of the three noniterative filters as
a function of SNR for $P_0 = 0.01$. Each point on the graph represents the sample variance
of the estimation error over 500 simulation runs. The results of the Kumaresan-Tufts
(KT) method and fourth order cumulant (FOC) method obtained from [52]
are also shown. The Cramer-Rao bound is also shown. This bound is derived in
the Appendix. Figure 4.5 illustrates that all of the noniterative filters give similar
performance for $P_0 = 0.01$. The performance is very close to the CR bound,
particularly at high SNR's. The results for the first sinusoid ($\omega_{k1} = 0.42 \cdot 2\pi$, $\alpha_{k1} = 0.2$)
are slightly worse than those for the second sinusoid ($\omega_{k2} = 0.52 \cdot 2\pi$, $\alpha_{k2} = 0.1$).
This is probably because the first signal has been damped significantly before the
filter has converged. The nonlinear filter results are significantly better than the KT
and FOC results. However, the KT method makes no assumptions about the noise
statistics and does not require initial estimates. The nonlinear filtering techniques
assume that the noise statistics are known a priori. Section 4.4 presents a technique
to estimate the noise statistics on line.

Figure 4.2 Typical Simulation Results for the Error Variance, 4-State Model, SNR = 20 dB, $P_0 = 0.04$

Figure 4.3 Typical Simulation Results for the Estimation Error, 4-State Model, SNR = 0 dB, $P_0 = 0.04$

Figure 4.4 Typical Simulation Results for the Error Variance, 4-State Model, SNR = 0 dB, $P_0 = 0.04$

The results of the iterative techniques (LIKF, GIKF, EKFR) are given in
Figure 4.6. These results are slightly better than those for the noniterative filters
for SNR's above 10 dB, but slightly inferior for SNR's below 10 dB. Of course the
price paid for better performance is higher computational requirements. Among
the three iterative filters the LIKF performs slightly better than the other two,
particularly at low SNR's.

As the initial covariance increases the filter performance degrades. This is
illustrated in Figures 4.7 and 4.8, where the mean squared estimation error as a
function of SNR is shown for the noniterative and iterative filters, respectively,
for $P_0 = 0.04$. The higher order forms of the noniterative filters, the GSO and
the MVF, perform better than the EKF, especially at high SNR's. This satisfies
intuition in that the second order approximations for the GSO and the assumptions
of Gaussian error distribution for the MVF should be more valid at high SNR's than
at low SNR's. Results for all filters are still better than the KT and FOC methods.
However, it must again be emphasized that the KT method makes no assumptions
about the noise statistics or the initial estimation error.

Least squares-based techniques such as the KT method generally fall apart
at around 15 dB SNR due to the so-called threshold effect. A threshold occurs
whenever there are more than P zeros outside the unit circle in the singular value
decomposition process. One of the advantages of using filtering methods is that
performance smoothly degrades as the SNR is decreased. That is, there is no sharp
dropoff in performance around the 15 dB point as there is with the least squares
approaches. However, there is evidence that there is some kind of threshold effect
in the filter performance. In Figure 4.7 the curves representing the estimation error
for the two damping coefficients (Figures 4.7(b) and 4.7(d)) change slope between
10 dB and 20 dB SNR. At 10 dB the slope changes back to be roughly parallel
to the CR-bound curve. As shown in the Appendix, the performance of the filters
has a lower bound represented by the statistics of the initial estimation error. That
is, the sample variance of the estimation error cannot be any worse than the
initial covariance $P_0$. This constrains the worst case performance of these nonlinear
estimators.

Figure 4.7 Noniterated Filter Performance

EKF-type algorithms have been known to diverge for poor initial estimates.
In some cases these poor estimates lead to poor filter performance due to the
approximations made by first and second order Taylor series expansions. A test was devised
to detect situations with poor final state estimates based on the sample variance of
the time series generated by subtracting the estimated measurement, formed from
the final state estimates, from the actual measurements. This test is described in
Section 4.4. The test works best at high SNR's where the variance of the signal
plus noise is significantly different from the variance of the noise alone. The results
in Figures 4.7 and 4.8 are those which have passed this test. Figure 4.9 shows the
number of runs which passed the test as a function of SNR for each of the six
filters for $P_0 = 0.04$. In general, more runs were discarded by the noniterated filters
than by the iterated filters. Among the iterated filters there is no consistently better
performer. The EKF discarded many more runs than any other filter. The MVF
and the GSO performed about the same: significantly better than the EKF, but
worse in general than the iterated filters. The iterated filters not only discarded fewer
runs due to poor performance, but among those runs that were considered valid,
the iterated filters gave better overall sample variance.

4.3  Harmonic  Retrieval  in  Colored  Noise 


Consider the case where the measurement noise is colored by the single pole
filter model

$$u_k = \gamma_{k-1}u_{k-1} + v_{k-1} \tag{4.33}$$

where $u_k$ is the measurement noise, and $v_k$ is a zero mean white Gaussian noise
sequence with $v_k \sim N(0, R_k)$. The parameter $\gamma_{k-1}$ is the filter coefficient. To
accommodate the colored noise, state augmentation or measurement differencing
can be applied when $\gamma_{k-1}$ is known. The complete derivation of the filter equations
in colored noise is given in [65]. The plant and measurement models for the colored
noise model are given by

$$
\begin{aligned}
x_{k+1} &= x_k \\
z_k &= h_k(x_k) + u_k.
\end{aligned}
\tag{4.34}
$$

Here, $x_0$, $v_k$, and $u_0$ are mutually independent and Gaussian. We have
$x_0 \sim N(\hat{x}_0, P_0)$, and $u_0 \sim N(0, U_0)$.


4.3.1  Colored  Noise  -  Known  Filter  Coefficient 


When the filter coefficient $\gamma_{k-1}$ is known, a set of equivalent “derived”
measurements is obtained by subtracting $\gamma_{k-1}z_{k-1}$ from (4.34) to obtain

$$
\bar{z}_k = z_k - \gamma_{k-1}z_{k-1} = h_k(x_{k-1}) - \gamma_{k-1}h_{k-1}(x_{k-1}) + v_{k-1}
\tag{4.35}
$$

where $x_k = x_{k-1}$. The augmented measurement $\bar{z}_k$ is a nonlinear function of the
state with additive white noise.
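The whitening effect of measurement differencing can be demonstrated with a short simulation that uses only the standard library. The sketch generates single-pole colored noise according to (4.33) and differences it with a known coefficient; the function names, the seed, and the choice $U_0 = 1$ are assumptions of the example.

```python
import random

def colored_noise(K, gamma, r, seed=0):
    """Generate u_k = gamma * u_{k-1} + v_{k-1} with white v of variance r."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, r ** 0.5) for _ in range(K)]
    u = [rng.gauss(0.0, 1.0)]          # u_0 ~ N(0, U_0), with U_0 = 1 assumed
    for k in range(1, K):
        u.append(gamma * u[k - 1] + v[k - 1])
    return u, v

def difference(z, gamma):
    """Derived measurements z_k - gamma * z_{k-1} for k = 1, ..., K-1."""
    return [z[k] - gamma * z[k - 1] for k in range(1, len(z))]

# With no signal present the derived measurement is exactly the white sequence.
u, v = colored_noise(25, 0.8, 0.05)
derived = difference(u, 0.8)
```

When a signal is present the same operation leaves the white noise term unchanged and replaces the signal by $h_k - \gamma_{k-1}h_{k-1}$, which is why the filter must use the differenced Jacobian rather than the original one.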
Figure 4.9 Noise Discriminator Results for the Six Nonlinear Filters, 4-State Model, $P_0 = 0.04$

By applying the usual extended Kalman filter linearization of the
measurement equation, the filter equation

$$
\hat{x}_{k|k} = \hat{x}_{k-1|k-1} + K_k\left[\bar{z}_k - \bar{H}_k\hat{x}_{k-1|k-1}\right]
\tag{4.36}
$$

is obtained, where $\bar{H}_k = H_k - \gamma_{k-1}H_{k-1}$ and both $H_k$ and $H_{k-1}$ are evaluated at
$\hat{x}_{k-1|k-1}$. The filter gain and covariance propagation equations are now given by

$$
\begin{aligned}
K_k &= P_{k|k-1}\bar{H}_k^T\left(\bar{H}_k P_{k|k-1}\bar{H}_k^T + R_k\right)^{-1} \\
P_{k|k} &= (I_n - K_k\bar{H}_k)P_{k|k-1}.
\end{aligned}
\tag{4.37}
$$

The initial conditions for this estimator are given by

$$
\begin{aligned}
E[x_0 \mid z_0] &= \bar{x}_0 + \operatorname{Var}[x_0]H_0^T\left[H_0\operatorname{Var}[x_0]H_0^T + R_0\right]^{-1}\left[z_0 - H_0\bar{x}_0\right] \\
P_0 &= \operatorname{Var}[x_0] - \operatorname{Var}[x_0]H_0^T\left[H_0\operatorname{Var}[x_0]H_0^T + R_0\right]^{-1}H_0\operatorname{Var}[x_0]
\end{aligned}
\tag{4.38}
$$

4.3.2  Colored  Noise  -  Unknown  Filter  Coefficient 

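If the filter coefficient $\gamma_k$ is unknown, the state vector can be augmented to
include the unknown parameter $\gamma_k$. The augmented state vector is defined as

$$x_k^a = [\, x_k^T \;\; \gamma_k \,]^T. \tag{4.39}$$

$H_k$ is then defined by

$$H_k = \frac{\partial h_k(x_k)}{\partial x_k}. \tag{4.40}$$

Let $g_k(x_k) = \gamma_k h_k(x_k)$, such that

$$
G_k = \left.\frac{\partial g_k(x_k)}{\partial x_k}\right|_{x_k = \hat{x}_{k-1|k-1}}.
\tag{4.41}
$$

If $\bar{H}_k = H_k - G_{k-1}$ then the filter equations are given by

$$
\begin{aligned}
\hat{x}_{k|k} &= \hat{x}_{k-1|k-1} + K_k\left[\bar{z}_k - \bar{H}_k\hat{x}_{k-1|k-1}\right] \\
P_{k|k} &= (I_n - K_k\bar{H}_k)P_{k|k-1} \\
K_k &= P_{k|k-1}\bar{H}_k^T\left(\bar{H}_k P_{k|k-1}\bar{H}_k^T + R_k\right)^{-1}.
\end{aligned}
\tag{4.42}
$$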

4.3.3  Experimental  Results  for  Estimation  in  Colored  Noise 

The EKF was used to estimate the model parameters of (4.6) with colored
noise given by model (4.33). The simulation results from 500 Monte Carlo trials
are presented in Figure 4.10 for initial uncertainty $P_0 = 0.04$. The filter coefficient
is $\gamma_{k-1} = 0.8$. Results are shown in this figure for the cases of known and unknown
$\gamma_k$. The solid lines in the four plots show the white noise CR bound, which is not
applicable in this case, but is included as a reference. Figure 4.10(d) shows the
EKF performance when the filter coefficient is unknown and estimated along with
the model parameters. Due to the frequency response of the colored noise filter
one would expect the estimates for the first sinusoid to be slightly worse relative
to the white noise CR bound than for the second sinusoid. That is, the gain due
to the colored noise filter at the frequency of the first sinusoid is higher than that
at the second sinusoid. This is verified by Figure 4.10. This figure also shows that
the estimation results of the model parameters when the coefficient $\gamma_{k-1}$ is unknown
and is estimated on-line are only slightly inferior to the results when $\gamma_{k-1}$ is known.

Figure 4.10 EKF Performance in Colored Noise with Known and Unknown Noise Coefficient, 4-State Model, $P_0 = 0.04$

4.4 Estimation of the Measurement Covariance


In many situations of practical interest, the measurement noise covariance $R$ is
unknown. In the following development it is assumed that the measurement noise is
stationary over all of the measurements and that the states are constant. An initial
estimate of the noise $R_0$ is used to initialize the filter. Given the measurements
$Z_k = (z_1; z_2; \ldots; z_k)$, $R_k$ can be estimated by taking the sample variance of the
innovations error at each iteration of the filter. Let $S_k$ be defined by

$$
S_k = E[\tilde{z}_k\tilde{z}_k^T] = H_k P_{k|k-1}H_k^T + R_k.
\tag{4.43}
$$

Given $H_k$ and $P_{k|k-1}$ an estimate of $S_k$ is required in order to obtain an estimate
for the measurement noise $R_k$. Let $\Theta_k$ be the matrix of estimates

$$
\Theta_k = \left(h_1(x_1)\big|_{x_1 = \hat{x}_{k|k-1}};\; h_2(x_2)\big|_{x_2 = \hat{x}_{k|k-1}};\; \ldots;\; h_k(x_k)\big|_{x_k = \hat{x}_{k|k-1}}\right).
\tag{4.44}
$$

$\Theta_k$ and $Z_k$ are $m$ by $K$ matrices where $m$ is the number of elements in the
measurement vector, and $K$ represents the total number of time intervals. Let the
innovations error matrix be defined by

$$E_k = Z_k - \Theta_k, \tag{4.45}$$

and let $\varepsilon_{k_i}$ be the $i$th row of $E_k$. Assuming that the innovations are zero mean, an
estimate of the $ij$th element of $S_k$ can be obtained using the sample variance

$$
\hat{s}_{k_{ij}} = \frac{1}{K-1}\,\varepsilon_{k_i}\varepsilon_{k_j}^T.
\tag{4.46}
$$

The measurement variance can then be estimated using

$$\hat{R}_k = \hat{S}_k - H_k P_{k|k-1}H_k^T. \tag{4.47}$$

In the situation where $H_k P_{k|k-1}H_k^T$ is equal to zero, that is when the estimates
are perfect, the parameter $s_k^2$ gives the sample variance of the measurement noise.
The statistic $(K-1)s_k^2/\sigma^2$ is chi-square distributed with $K$ degrees of freedom. The
chi-square distribution, represented by the sample variance statistic, has a 99.9%
probability of being less than or equal to $2\sigma^2$ for $K = 50$, where $\sigma^2$ is the variance of
the measurement noise. The statistic $s_k^2$ can be used as a way to detect poor state
estimates. A reasonable criterion is to reject the estimate as probably bad if $s_k^2$ is
greater than $2\sigma^2$. This criterion works very well at high SNR's where the variance of
signal plus noise is significantly different from the variance of noise alone. However,
at low SNR's the variance of signal plus noise can be close to the variance of the
noise alone and this test does not work as well.
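The noise covariance estimate and the rejection test can be sketched as follows, assuming NumPy. The function names are hypothetical; `innov` is the m-by-K innovations error matrix of (4.45), and `HPHt` stands for the product $H_k P_{k|k-1} H_k^T$ supplied by the filter.

```python
import numpy as np

def estimate_R(innov, HPHt):
    """Estimate S by the zero-mean sample covariance (4.46), then R by (4.47)."""
    K = innov.shape[1]
    S_hat = innov @ innov.T / (K - 1)
    return S_hat - HPHt, S_hat

def reject(residuals, sigma2):
    """Flag a run as probably bad when the residual sample variance exceeds 2*sigma2."""
    r = np.asarray(residuals, dtype=float)
    s2 = float(r @ r) / (len(r) - 1)
    return s2 > 2.0 * sigma2, s2

# Illustrative usage with a small hand-made innovations matrix.
innov = np.array([[1.0, -1.0, 1.0, -1.0],
                  [2.0, 2.0, -2.0, -2.0]])
R_hat, S_hat = estimate_R(innov, 0.1 * np.eye(2))
bad, s2 = reject([1.0] * 50, 0.05)
```

Nothing in the subtraction (4.47) keeps the estimate positive definite when the sample covariance is small, so a practical implementation may need to guard against negative diagonal entries; that safeguard is not part of the sketch.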

4.4.1  Experimental  Results  -  Unknown  Noise  Statistics 

Experimental results of the extended Kalman filter with unknown noise covariance
are shown in Figure 4.11. This figure gives two sets of curves. One set
shows the filter performance with one pass through the data. During this pass the
noise statistics are estimated at each iteration using equation (4.35). The initial
noise covariance estimate was $\sigma^2 = 0.05$, corresponding to 10 dB SNR, for all
points on this curve. The second curve shows the results of processing the data
with two passes. During the first pass the noise is estimated as in the first case.
During the second pass the noise covariance estimate is held constant at the final
value from the first pass and the data is filtered using the normal extended Kalman
filter relations. For this second pass the initial state covariance is reset to 0.04.
The initial state estimate for the second pass was equal to the final state estimate
from the first pass. The results show that the two-pass filter performs significantly
better than the single-pass filter, particularly at high SNR’s. At low SNR’s the two
filters result in about the same performance. Comparing Figure 4.11 to Figure 4.7,
the two-pass filter with initially unknown noise statistics results in about the same
performance as the (single-pass) extended Kalman filter with known noise statistics.
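
The two-pass procedure can be illustrated with a deliberately simplified sketch. It assumes a scalar, linear model with a constant state ($z_k = x + v_k$) instead of the 4-state sinusoid model, and it estimates the noise variance with the sample-variance relation (4.47), not equation (4.35), which is not reproduced in this section. All names and numerical values below are hypothetical.

```python
import random


def kalman_pass(z, x0, p0, r0, adapt_r):
    """Scalar Kalman filter for a constant state x with measurements z_k = x + v_k.

    When adapt_r is True, R is re-estimated at every step from the running
    innovation sample variance minus the predicted state variance, as in (4.47).
    """
    x, p, r = x0, p0, r0
    sum_sq = 0.0
    for k, zk in enumerate(z, start=1):
        innov = zk - x                    # innovation (H = 1)
        if adapt_r:
            sum_sq += innov * innov
            if k > 1:
                s = sum_sq / (k - 1)      # sample variance of the innovations
                r = max(s - p, 1e-6)      # R_hat = S - H P H^T, floored to stay positive
        gain = p / (p + r)
        x = x + gain * innov
        p = (1.0 - gain) * p
    return x, p, r


def two_pass(z, x0, p0, r0):
    """Pass 1 estimates R; pass 2 holds R fixed, resets P, and restarts from the pass-1 state."""
    x1, _, r_hat = kalman_pass(z, x0, p0, r0, adapt_r=True)
    x2, p2, _ = kalman_pass(z, x1, p0, r_hat, adapt_r=False)
    return x2, p2, r_hat


random.seed(0)
true_x, true_r = 1.0, 0.05
z = [true_x + random.gauss(0.0, true_r ** 0.5) for _ in range(200)]
x_hat, p_hat, r_hat = two_pass(z, x0=0.5, p0=0.04, r0=0.5)
print(x_hat, p_hat, r_hat)
```

The second pass reuses the data with the noise variance frozen, which is the design choice the text credits for the improvement at high SNR.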

4.4.2  Experimental  Results  -  Single  Sinusoid 

A study was also performed on a single sinusoid model. This model was
formed by letting $P = 1$, $c_{k,1} = 1$, $\alpha_{k,1} = 0.12$, $\omega_{k,1} = 0.22 \times 2\pi$,
and $\phi_{k,1} = 0$ in equation (4.6). The unknown random parameters in this system were
the damping coefficient and the frequency. The sample variances for all six of the
filters are given in Figure 4.12 for initial estimation error variance $P_0 = 0.09$ for
both state variables. Figure 4.13 presents the sample variance for the model with initial
estimation error uniformly distributed between 0 and $2\pi$ for the frequency,
corresponding to no a priori information, and with initial estimation error uniformly
distributed between 0 and 1 for the damping coefficient. The results are similar to
those in Figure 4.12 for high SNR, especially for the LIKF. However, for low SNR the
performance is significantly worse in Figure 4.13 than in Figure 4.12. These results
satisfy intuition in that whenever there are fewer parameters to be estimated the filter
can sustain larger initial estimation error.

Figure 4.11 EKF Performance in White Noise with Unknown Noise Covariance, 4-State Model, $P_0 = 0.04 I$

Figure 4.12 Performance of All Six Nonlinear Filters, 2-State Model, $P_0 = 0.09 I$

Figure 4.13 Performance of All Six Nonlinear Filters, 2-State Model, Large $P_0$

4.5 Conclusion

Methods based on nonlinear recursive filters for estimating the parameters
of exponentially damped sinusoids in white and colored noise have been described.
Filter equations have been developed for time varying systems in white and colored
noise with known and unknown noise covariances. Simulation results for the problem
of estimating the parameters of two exponentially damped sinusoids show that
the nonlinear filtering techniques described perform very close to the Cramer-Rao
bound, even at low SNR’s, for relatively small initial estimation errors. For larger
values of the initial uncertainty on the model parameters the iterated forms of the
extended Kalman filter give better performance than the noniterated forms. Among
the noniterated forms the Gaussian second order filter and the minimum variance
filter give comparable performance, and both perform better than the extended
Kalman filter, particularly at high SNR’s. In addition, these two high order filters
are generally more stable, as evidenced by the number of final state estimates that
passed the noise discriminator test. The extended Kalman filter has been shown to
give good results in colored noise with known and unknown noise filter parameters.
A technique has also been developed for on-line estimation of the measurement noise
covariance.

In summary, the following general observations can be drawn about the
performance of nonlinear filters for harmonic retrieval:

*  The  nonlinear  filters  incorporate  a  priori  knowledge  about  the  state.  The 
KT  method  has  no  inherent  capability  to  use  a  priori  knowledge. 

*  As  with  the  KT  method  the  nonlinear  filter  method  approaches  the  CR  bound 
at  high  SNR’s.  However,  the  performance  of  the  nonlinear  filters  does  not 
degrade  sharply  in  the  range  of  10  to  15  dB  as  the  performance  of  the  KT 
method  does.  Worst  case  performance  of  the  nonlinear  filters  is  bounded  by 
the  initial  error  covariance. 


*  The nonlinear filters can estimate parameters in colored noise. The KT
method is not designed to operate in colored noise.

*  The  filters  converge  relatively  fast,  making  the  nonlinear  filters  suitable  for 
short  data  lengths. 

*  The  nonlinear  filters  are  recursive  in  nature,  thereby  providing  adaptability 
to  time  varying  parameters. 

*  The primary disadvantage of the nonlinear filter methods is that they may
require good initial estimates to converge to a valid solution. The KT method
does not share this problem. However, the multi-filter resolution approach,
which is defined in Chapter 5 and implemented in Chapter 7 for time delay
estimation, can be used to accommodate poor initial conditions by partitioning
them into smaller intervals of uncertainty and applying joint
detection/estimation techniques for resolving ambiguity.

*  Another interesting approach to performing harmonic retrieval with large
initial estimation error would be to use the KT method to initialize the
nonlinear filter. The nonlinear filter would then be used to refine the estimates.




Chapter  5 

Joint  Detection/Estimation 


This chapter presents a procedure for combining detection and estimation
theory. This procedure is used in subsequent chapters for selected signal processing
problems. Two general applications of joint detection/estimation theory are
addressed. In the first application the nonlinear measurement model is constant,
but the initial estimation error is large enough that the approximations made
by nonlinear estimators such as the extended Kalman filter may lead to very poor
performance. In this case the a priori pdf can be partitioned into $M$ smaller
subregions. Each subregion is associated with a hypothesis and the problem is treated as
an $M$-ary hypothesis joint detection/estimation problem. In the second application
the measurement model is unknown. Again, several hypotheses are proposed. Each
hypothesis is associated with a different model. For each model the state variables
are estimated on line. This operation is performed concurrently with estimation of the
a posteriori probability of each hypothesis. The states are not constrained to be
common among the models. A third application involves a combination of the other
two applications. This chapter presents the general technique for applying joint
detection/estimation theory to these applications.

The  traditional  estimation  theory  approach  to  solving  parameter  estimation 
problems  involves  starting  with  initial  estimates  of  the  state  variables  and  refining 
these  estimates  by  filtering  the  measurement  data.  The  performance  of  the  filter  is 
governed  by  the  statistics  of  the  process  and  measurement  noises,  and  by  the  process 
and  measurement  models.  Filters  such  as  the  Kalman  filters  may  exhibit  unstable 
behavior. For nonlinear process or measurement models the approximations resulting
from truncated Taylor series expansions used in the implementation of the EKF
may also lead to poor performance. It is well known that under certain conditions
the  extended  Kalman  filter  may  give  poor  results  due  to  the  approximation  made 
with  the  first  order  Taylor  series  expansion.  This  is  particularly  evident  when  the 
initial  estimation  error  is  large.  This  problem  is  discussed  in  the  context  of  the 
harmonic  retrieval  problem  in  Chapter  4. 

$M$-ary detection is used in a number of practical signal processing problems.
One example of this type of detection is ambiguity function processing for
radar/sonar signal processing. This process involves the convolution or matched
filtering of a received signal with a number of signal replicas. Each replica has a
different estimate of the unknown states (e.g., amplitude, delay, Doppler shift). The
replica that results in the largest value of the ambiguity function is chosen and used
to determine the state estimates. Autoregressive model order selection may also be
treated using $M$-ary detection theory, where the $M$ hypotheses correspond to all of
the combinations of a discrete set of frequencies. These two problems are addressed
in Chapters 6-8 of this thesis using the joint detection/estimation approach which
is presented in this chapter.
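
As a rough illustration of detection-only $M$-ary processing, the sketch below correlates noise-free data against a small bank of replicas and keeps the best match. The candidate frequencies, the zero-lag correlation, and the function names are assumptions made for the example, not part of the thesis.

```python
import math


def correlate(received, replica):
    """Zero-lag correlation (matched-filter output) between the data and one replica."""
    return sum(r * s for r, s in zip(received, replica))


def best_hypothesis(received, replicas):
    """Return the index of the replica with the largest correlation magnitude."""
    scores = [abs(correlate(received, rep)) for rep in replicas]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical example: three candidate frequencies (cycles per sample).
n = 64
freqs = [0.10, 0.22, 0.35]
replicas = [[math.cos(2 * math.pi * f * k) for k in range(n)] for f in freqs]
received = [math.cos(2 * math.pi * 0.22 * k) for k in range(n)]
print(freqs[best_hypothesis(received, replicas)])
```

Here the replicas are fixed in advance, so the resolution of the estimate is limited by the spacing of the candidate set.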

The joint detection/estimation (JD/E) approach combines $M$-ary detection
with estimation. The distinction of JD/E over pure $M$-ary detection is that the
estimates are refined for each hypothesis. JD/E may permit the use of a smaller
number of hypotheses than detection only, since the hypotheses are continuously
refined through estimation, which is performed concurrently with the hypothesis
testing. Alternatively, one may use the same number of hypotheses as the detection
only problem, but by refining the estimates through filtering, better estimates may
result. A major reason for using JD/E in nonlinear filtering problems is to reduce
the effects of large initial estimation error on the truncated Taylor series expansion
used in the EKF.

In this chapter a recursive technique for joint detection/estimation is
developed based on nonlinear state and measurement models. The objective is to
develop a procedure that results in an optimal minimum variance estimate for the
state variables for each hypothesis and, given this optimal estimate, to select the
hypothesis that most closely matches the measurement data.

The development of the joint detection/estimation is based on the segmentation
of the unknowns into a state vector $x_k$ and a parameter vector $\theta$. Let $\theta \in \Theta$
designate the parameter vector that describes the different models that may have
generated the measurements. Each model is identified with a specific hypothesis,
and corresponds to a unique $\theta$, $\theta \in \Theta$. The set $\Theta$ is assumed to be countable (in
our application also finite). In addition, the parameter vector $\theta$ is assumed time
invariant. The development of the joint detection/estimation presented here follows
a similar procedure to that presented by Fredriksen et al. [58]. However, these
authors combined the state vector $x_k$ and the parameter vector $\theta$ together into one
state vector that was used in the estimation process. In addition, it was required
that all of the variables in the augmented state vector be energy variables. In the
development that follows a distinction is made between the state vector $x_k$ and the
parameters $\theta$. The state variables are the same for each hypothesis, while the vector
$\theta$ is used to distinguish between the various hypotheses. There is no restriction on
the state vector $x_k$. Under hypothesis $H_\theta$ the discrete time measurements are
modeled according to


$$ H_\theta:\quad z_k = h_k(x_k, \theta) + v_{k,\theta} \tag{5.1} $$

where the state $x_k$ is common for all $\theta \in \Theta$, and satisfies the discrete time process
equation

$$ x_k = f_{k-1}(x_{k-1}, \theta) + w_{k-1,\theta} \tag{5.2} $$

with initial state estimate $\hat{x}_{0|0,\theta}$ and initial state covariance $P_{0|0,\theta}$. The initial
state estimate, the measurement noise, and the process noise are uncorrelated up
to the moment order required by the implemented filter (e.g., EKF or EHOF). The
process and measurement noise are zero mean and distributed with covariances
$E[w_{k,\theta} w_{k,\theta}^T] = Q_{k,\theta}$ and $E[v_{k,\theta} v_{k,\theta}^T] = R_{k,\theta}$.

The model structure in (5.1) and (5.2) is similar to the traditional state and
observation models with the exception that an unknown time invariant parameter
vector $\theta$ is used to distinguish between the various hypotheses. This model structure
is designed to accommodate a large class of process and measurement models. The
only restriction is that the state vector is the same for each hypothesis. Note,
however, that not all of the elements of the state vector $x_k$ are required to be
estimated under each hypothesis.

It is assumed that $\theta$ has known probability density function $p(\theta)$. The vector
$\theta$ is required to contain coefficients of a sufficient set of energy parameters (e.g.,
amplitude, time duration) such that the null hypothesis is indicated whenever this
set of parameters is equal to zero. Although it is not mandatory that all of the
parameters in $\theta$ be zero to indicate the null hypothesis, it simplifies the discussion
of the method. Thus, under $H_0$, $\theta_0 = 0$, and $h_k(x_k, \theta_0) = h_k(x_k, 0) = 0$.

5.1  A  Bayes  Test  for  Joint  Detection/Estimation 

The Bayesian approach for optimum detection involves the minimization of
the average decision cost over all possible decisions under all possible hypotheses.
Let $Z_k = \{z_1, z_2, \ldots, z_k\}$ be the collection of measurements that are functions of the
time varying state variables $x_k$ and the unknown time invariant parameter vector
$\theta$. The total average cost, or Bayes risk for joint detection/estimation, of making
the decisions $D_j$ associated with hypotheses $H_j$, $j = 0, \ldots, M$, can be expressed as

$$ \mathcal{R}_{D\&E} = \sum_{j=0}^{M} \int_{R_j} \int\!\!\int C_j(x_k)\, p(z_k, x_k, \theta \,|\, Z_{k-1})\, d\theta\, dx_k\, dz_k \tag{5.3} $$

where $C_j(x_k)$ is the cost of making decision $j$, and $p(z_k, x_k, \theta \,|\, Z_{k-1})$ is the
density function of all of the random parameters in the system, given the
measurements up to $k - 1$. The goal is to find the estimate $\hat{x}_{k|k}$ that minimizes the total
average cost. By conditioning the decision probability on the past measurements a
recursive technique for joint detection/estimation can be developed.

If the cost $C_j(\cdot)$ is not a function of the state $x_k$ (i.e., the detection only case
[75]), then $C_j$ can be moved outside of the integral and the risk becomes

$$ \mathcal{R}_{D} = \sum_{j=0}^{M} C_j\, P(D_j \,|\, Z_{k-1}) \tag{5.4} $$

where $P(D_j \,|\, Z_{k-1})$ is given by

$$ P(D_j \,|\, Z_{k-1}) = \int_{R_j} \int\!\!\int p(z_k, x_k, \theta \,|\, Z_{k-1})\, d\theta\, dx_k\, dz_k. \tag{5.5} $$

By applying Bayes rule to the joint density function in (5.3), the Bayes risk for joint
detection/estimation can be expressed as

$$ \mathcal{R}_{D\&E} = \sum_{j=0}^{M} \int_{R_j} \int\!\!\int C_j(x_k)\, p(z_k \,|\, x_k, \theta)\, p(x_k \,|\, Z_{k-1}, \theta)\, p(\theta \,|\, Z_{k-1})\, d\theta\, dx_k\, dz_k \tag{5.6} $$

where $p(z_k \,|\, x_k, \theta) = p(z_k \,|\, Z_{k-1}, x_k, \theta)$ because of the Markov noise process in (5.1).
That is, it is easily seen from (5.1) that if $x_k$ and $\theta$ are known, then the density
function for $z_k$ is a function of the measurement noise $v_k$ only. Let the parameter
space spanned by $\theta$ be discrete such that it can be characterized by a finite number of
quantized points. Then the a priori probability density of $\theta$ given the measurements
$Z_{k-1}$ can be expressed as

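$$ p(\theta \,|\, Z_{k-1}) = \sum_{i=0}^{M} P(\theta_i \,|\, Z_{k-1})\, p_i(\theta \,|\, Z_{k-1}) = \sum_{i=0}^{M} P(\theta_i \,|\, Z_{k-1})\, \delta(\theta - \theta_i) \tag{5.7} $$

where $P(\theta_i \,|\, Z_{k-1})$ is the a priori probability of hypothesis $H_i$ given the measurements
$Z_{k-1}$. Thus, $P(\theta_i \,|\, Z_{k-1})$ can be used in place of the more conventional $P(H_i \,|\, Z_{k-1})$
to demonstrate the explicit dependence of each hypothesis on the parameter $\theta_i$. The
hypothesis $H_i$ then corresponds to

$$ H_i:\quad z_k = h_k(x_k, \theta_i) + v_{k,\theta_i}. \tag{5.8} $$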

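Given the measurements $Z_{k-1}$, the cost associated with hypothesis $j$ [75, pp.
140-141] can be expressed as

$$ C_j(x_k) = \sum_{i=0}^{M} C_{ji}(\hat{x}_{k|k}, x_k)\, \frac{P(\theta_i \,|\, Z_{k-1})\, \delta(\theta - \theta_i)}{p(\theta \,|\, Z_{k-1})} \tag{5.9} $$

where $C_{ji}$ is the cost of deciding $H_j$ given $H_i$ is true. Substituting (5.9) into (5.6),
the Bayes risk becomes

$$ \mathcal{R}_{D\&E} = \sum_{j=0}^{M} \sum_{i=0}^{M} \int_{R_j} \int P(\theta_i \,|\, Z_{k-1})\, C_{ji}(\hat{x}_{k|k}, x_k)\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k. \tag{5.10} $$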

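The risk associated with deciding hypothesis $H_0$, the null hypothesis, is found by
evaluating (5.10) for $j = 0$. Noting that if the decision regions are disjoint, then
$R_0 = R - \sum_{j=1}^{M} R_j$, this risk becomes

$$
\begin{aligned}
C_0(x_k)\, P(D_0 \,|\, Z_{k-1}) &= \sum_{i=0}^{M} \int_{R_0} \int P(\theta_i \,|\, Z_{k-1})\, C_{0i}(x_k)\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k \\
&= \sum_{i=0}^{M} \int_{R} \int P(\theta_i \,|\, Z_{k-1})\, C_{0i}(x_k)\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k \\
&\quad - \sum_{j=1}^{M} \sum_{i=0}^{M} \int_{R_j} \int P(\theta_i \,|\, Z_{k-1})\, C_{0i}(x_k)\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k
\end{aligned}
\tag{5.11}
$$

where the explicit dependence of $C_{0i}(x_k)$ on $\hat{x}_{k|k}$ has been removed because
an estimate is not required whenever hypothesis $H_0$ is decided. Using the result
(5.11) in (5.10) the total risk now becomes

$$
\begin{aligned}
\mathcal{R}_{D\&E} &= \sum_{i=0}^{M} \int_{R} \int P(\theta_i \,|\, Z_{k-1})\, C_{0i}(x_k)\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k \\
&\quad + \sum_{j=1}^{M} \sum_{i=0}^{M} \int_{R_j} \int P(\theta_i \,|\, Z_{k-1})\, \left[ C_{ji}(\hat{x}_{k|k}, x_k) - C_{0i}(x_k) \right] p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k.
\end{aligned}
\tag{5.12}
$$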


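Under the null hypothesis $H_0$, the cost is not a function of $x_k$. Thus,
$C_{j0}(\hat{x}_{k|k}, x_k) = C_{j0}(\hat{x}_{k|k})$. In addition, $C_{00}$ is neither a function of $\hat{x}_{k|k}$ nor of
$x_k$. Furthermore, under the null hypothesis, $p(z_k \,|\, x_k, \theta_0) = p(z_k \,|\, \theta_0)$. That is, the
density function for the measurements is not a function of the state variables. This
leads to

$$ \int P(\theta_0 \,|\, Z_{k-1}) \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] p(z_k \,|\, x_k, \theta_0)\, p(x_k \,|\, Z_{k-1}, \theta_0)\, dx_k = P(\theta_0 \,|\, Z_{k-1}) \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] p(z_k \,|\, \theta_0). \tag{5.13} $$

By defining the likelihood ratio

$$ L_{ji}(z_k) = \frac{P(\theta_i \,|\, Z_{k-1}) \int \left[ C_{0i}(x_k) - C_{ji}(\hat{x}_{k|k}, x_k) \right] p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k}{P(\theta_0 \,|\, Z_{k-1}) \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] p(z_k \,|\, \theta_0)} \tag{5.14} $$

the Bayesian risk can now be expressed as

$$
\begin{aligned}
\mathcal{R}_{D\&E} &= \sum_{i=0}^{M} \int_{R} \int P(\theta_i \,|\, Z_{k-1})\, C_{0i}(x_k)\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k\, dz_k \\
&\quad + \sum_{j=1}^{M} \int_{R_j} P(\theta_0 \,|\, Z_{k-1}) \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] p(z_k \,|\, \theta_0) \left[ 1 - \sum_{i=1}^{M} L_{ji}(z_k) \right] dz_k.
\end{aligned}
\tag{5.15}
$$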

The first term on the RHS of (5.15) is constant. Hence, it does not contribute
to the selection of the decision boundaries. It is assumed that the cost of making a
wrong decision $C_{ji}(\hat{x}_{k|k}, x_k)$, $j \neq i$, is greater than the cost of making a correct
decision $C_{ii}(\hat{x}_{k|k}, x_k)$. Since all probabilities and density functions are positive or
zero, $P(\theta_0 \,|\, Z_{k-1}) \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] p(z_k \,|\, \theta_0) \geq 0$. Thus, the decision is made in favor
of hypothesis $H_j$ based on the selection

$$ j = \arg\min_{j} \left\{ \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] \left[ 1 - \sum_{i=1}^{M} L_{ji}(z_k) \right] \right\}. \tag{5.16} $$

This decision rule is based on the fact that the estimate $\hat{x}_{k|k}$ is optimum.

The second part of the joint detection/estimation procedure is to find the
optimum estimate. The optimum estimator, given that decision $D_j$ was decided
after the first stage, will be denoted $\hat{x}^{j}_{k|k}$ and is determined by finding the value that
minimizes the total average cost in decision region $R_j$. That is, $\hat{x}^{j}_{k|k}$ is determined
from the condition

$$ \min_{\hat{x}_{k|k}} \int_{R_j} P(\theta_0 \,|\, Z_{k-1}) \left[ C_{j0}(\hat{x}_{k|k}) - C_{00} \right] p(z_k \,|\, \theta_0) \left[ 1 - \sum_{i=1}^{M} L_{ji}(z_k) \right] dz_k. \tag{5.17} $$

The exact form of the optimum estimator is a function of the type of cost function.


5.1.1  Quadratic  Cost  Function 

Consider the case where the cost is a quadratic function of the estimation
error. The quadratic cost is expressed as

$$ C_{ji}(\hat{x}_{k|k}, x_k) = b_{ji} + c_{ji}\, [x_k - \hat{x}_{k|k}]^T [x_k - \hat{x}_{k|k}]. \tag{5.18} $$

The cost includes the term $b_{ji}$, which represents the conventional cost associated with
the problem of detection only, and the cost $c_{ji}$, which accounts for the estimation
error. The cost of deciding hypothesis $D_j$ given hypothesis $H_0$ is related to the
estimate only and

$$
\begin{aligned}
C_{00} &= c_{00} \\
C_{0i}(x_k) &= b_{0i} + c_{0i}\, x_k^T x_k, & i &\neq 0 \\
C_{j0}(\hat{x}_{k|k}) &= b_{j0} + c_{j0}\, \hat{x}_{k|k}^T \hat{x}_{k|k}, & j &\neq 0 \\
C_{ji}(\hat{x}_{k|k}, x_k) &= b_{ji} + c_{ji}\, [x_k - \hat{x}_{k|k}]^T [x_k - \hat{x}_{k|k}], & i &\neq 0,\ j \neq 0.
\end{aligned}
\tag{5.19}
$$

Using the cost function (5.18), the integral in (5.17) can now be expressed as

$$ I_j = \int_{R_j} P(\theta_0 \,|\, Z_{k-1})\, p(z_k \,|\, \theta_0)\, G(z_k, \hat{x}_{k|k})\, dz_k \tag{5.20} $$

where

$$
\begin{aligned}
G(z_k, \hat{x}_{k|k}) &= \sum_{i=0}^{M} P(\theta_i \,|\, Z_{k-1}) \int [b_{ji} - b_{0i}]\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k \\
&\quad + \sum_{i=0}^{M} P(\theta_i \,|\, Z_{k-1}) \int [c_{ji} - c_{0i}]\, [x_k - \hat{x}_{k|k}]^T [x_k - \hat{x}_{k|k}]\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k.
\end{aligned}
\tag{5.21}
$$

The first term on the right hand side of the above equation is independent of $\hat{x}_{k|k}$ and
can be excluded from consideration in determining the optimum estimator. Since
it is assumed that the cost of making a wrong decision is greater than the cost of
making a correct decision ($c_{ji} > c_{jj}$), $G(z_k, \hat{x}_{k|k})$ is always positive or zero. So if the
integrand is minimized then the integral will also be minimized, and the optimum
value of $\hat{x}_{k|k}$ is found by differentiating the integrand with respect to the estimate
and equating it to zero.


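$$ \left. \frac{\partial G(z_k, \hat{x}_{k|k})}{\partial \hat{x}_{k|k}} \right|_{\hat{x}_{k|k} = \hat{x}^{j}_{k|k}} = 0. \tag{5.22} $$

Carrying out this minimization, the optimum value of the state given hypothesis $H_j$
is determined from

$$ \sum_{i=0}^{M} [c_{ji} - c_{0i}]\, P(\theta_i \,|\, Z_{k-1}) \int [x_k - \hat{x}^{j}_{k|k}]\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k = 0 \tag{5.23} $$

which gives

$$ \hat{x}^{j}_{k|k} = \frac{\sum_{i=0}^{M} [c_{ji} - c_{0i}]\, P(\theta_i \,|\, Z_{k-1}) \int x_k\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k}{\sum_{i=0}^{M} [c_{ji} - c_{0i}]\, P(\theta_i \,|\, Z_{k-1}) \int p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k}. \tag{5.24} $$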

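It is observed that no estimation is performed under hypothesis $H_0$. Thus, the
density function $p(x_k \,|\, Z_{k-1}, \theta_0)$ is equal to zero, and

$$ [c_{j0} - c_{00}]\, P(\theta_0 \,|\, Z_{k-1}) \int x_k\, p(z_k \,|\, x_k, \theta_0)\, p(x_k \,|\, Z_{k-1}, \theta_0)\, dx_k = 0. \tag{5.25} $$

Using this result in (5.24) yields

$$ \hat{x}^{j}_{k|k} = \sum_{i=1}^{M} \Gamma_{ji}(z_k)\, \hat{x}_{k|k,\theta_i} \tag{5.26} $$

where

$$ \Gamma_{ji}(z_k) = \frac{[c_{ji} - c_{0i}]\, P(\theta_i \,|\, Z_{k-1}) \int p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k}{\sum_{m=0}^{M} [c_{jm} - c_{0m}]\, P(\theta_m \,|\, Z_{k-1}) \int p(z_k \,|\, x_k, \theta_m)\, p(x_k \,|\, Z_{k-1}, \theta_m)\, dx_k} \tag{5.27} $$

and

$$ \hat{x}_{k|k,\theta_i} = \frac{\int x_k\, p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k}{\int p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i)\, dx_k}. \tag{5.28} $$

From Bayes rule

$$ p(z_k \,|\, x_k, \theta_i)\, p(x_k \,|\, Z_{k-1}, \theta_i) = p(x_k \,|\, Z_k, \theta_i)\, p(z_k \,|\, Z_{k-1}, \theta_i) \tag{5.29} $$

and $\hat{x}_{k|k,\theta_i}$ becomes

$$ \hat{x}_{k|k,\theta_i} = \frac{\int x_k\, p(x_k \,|\, Z_k, \theta_i)\, dx_k}{\int p(x_k \,|\, Z_k, \theta_i)\, dx_k}. \tag{5.30} $$

This is the mean of the a posteriori density function of $x_k$ given the measurements
$Z_k$ and hypothesis $H_i$. If it is now assumed that

$$ [c_{ji} - c_{0i}] = [c_{jk} - c_{0k}], \qquad 0 < i, j, k \leq M, \tag{5.31} $$

that is, the cost of all errors is the same, then (5.27) becomes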

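$$ \Gamma_i(z_k) = \frac{P(\theta_i \,|\, Z_{k-1}) \int p(x_k \,|\, Z_k, \theta_i)\, p(z_k \,|\, Z_{k-1}, \theta_i)\, dx_k}{\sum_{m=0}^{M} P(\theta_m \,|\, Z_{k-1}) \int p(x_k \,|\, Z_k, \theta_m)\, p(z_k \,|\, Z_{k-1}, \theta_m)\, dx_k}. \tag{5.32} $$

The denominator is the marginal density of $z_k$ given the measurements $Z_{k-1}$, and it
can be shown ([77], p. 85) that $\Gamma_i(z_k) = P(\theta_i \,|\, Z_k)$, the a posteriori probability of $\theta_i$
given the measurements $Z_k$. This gives the recursion for the a posteriori probability
of hypothesis $H_i$ as

$$ P(\theta_i \,|\, Z_k) = \frac{P(\theta_i \,|\, Z_{k-1})\, p(z_k \,|\, Z_{k-1}, \theta_i)}{p(z_k \,|\, Z_{k-1})}. \tag{5.33} $$

Substituting the likelihood ratio

$$ \Lambda_i(z_k) = \frac{p(z_k \,|\, Z_{k-1}, \theta_i)}{p(z_k \,|\, Z_{k-1}, \theta_0)} \tag{5.34} $$

into (5.33), the a posteriori probability becomes

$$ P(\theta_i \,|\, Z_k) = \frac{P(\theta_i \,|\, Z_{k-1})\, \Lambda_i(z_k)}{\sum_{m=0}^{M} P(\theta_m \,|\, Z_{k-1})\, \Lambda_m(z_k)}. \tag{5.35} $$

Under the conditions (5.31) the optimal estimate becomes

$$ \hat{x}_{k|k} = \sum_{i=1}^{M} P(\theta_i \,|\, Z_k)\, \hat{x}_{k|k,\theta_i}. \tag{5.36} $$

So the optimal estimate is the sum of all of the conditional means weighted by the
a posteriori probability of each hypothesis.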
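
The recursion (5.33) and the weighted estimate (5.36) can be sketched as follows. This is a minimal illustration, assuming that the per-hypothesis likelihoods $p(z_k|Z_{k-1},\theta_i)$ and the per-hypothesis estimates are supplied by a bank of filters that is not shown; the function names and the numbers are hypothetical, and the lists cover only hypotheses that carry a state estimate.

```python
def update_posteriors(priors, likelihoods):
    """One step of (5.33): P(theta_i|Z_k) from P(theta_i|Z_{k-1}) and p(z_k|Z_{k-1},theta_i)."""
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)              # marginal density p(z_k | Z_{k-1})
    if total <= 0.0:
        raise ValueError("all hypotheses have zero likelihood")
    return [w / total for w in weighted]


def combined_estimate(posteriors, estimates):
    """Posterior-weighted sum of the per-hypothesis state estimates, as in (5.36)."""
    dim = len(estimates[0])
    return [sum(p * x[d] for p, x in zip(posteriors, estimates)) for d in range(dim)]


priors = [0.5, 0.5]
likelihoods = [0.3, 0.1]               # hypothetical values of p(z_k | Z_{k-1}, theta_i)
post = update_posteriors(priors, likelihoods)
x_hat = combined_estimate(post, [[1.0, 0.0], [3.0, 2.0]])
print(post, x_hat)
```

Dividing every likelihood by $p(z_k|Z_{k-1},\theta_0)$ before the call gives the likelihood-ratio form (5.35) with the same result, since the common factor cancels in the normalization.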

Equations (5.35) and (5.36) are verified by Lainiotis [59] for Gaussian
distributions. The implementation of the above procedure involves the operation of
several extended Kalman filters in parallel, one filter for each possible value of the
parameter $\theta_i$. In addition to estimating the state $\hat{x}_{k|k,\theta_i}$ at each iteration of the
Kalman filter, the a posteriori probability of $\theta_i$, $P(\theta_i \,|\, Z_k)$, is also propagated. The
optimal mean square estimate $\hat{x}_{k|k}$ of $x_k$ is the sum over all values of $\theta_i$ of the
estimates from each of the filters weighted by the a posteriori probability of each
value of $\theta_i$.

The error covariance is given by

$$ P_{k|k} = E\left\{ [x_k - \hat{x}_{k|k}][x_k - \hat{x}_{k|k}]^T \,\middle|\, Z_k \right\} \tag{5.37} $$

which can be determined from the relation

$$ P_{k|k} = \int E\left\{ [x_k - \hat{x}_{k|k}][x_k - \hat{x}_{k|k}]^T \,\middle|\, Z_k, \theta \right\} p(\theta \,|\, Z_k)\, d\theta. \tag{5.38} $$

Substituting (5.7) gives

$$
\begin{aligned}
P_{k|k} &= \sum_{i=1}^{M} E\left\{ [x_k - \hat{x}_{k|k}][x_k - \hat{x}_{k|k}]^T \,\middle|\, Z_k, \theta_i \right\} P(\theta_i \,|\, Z_k) \\
&= \sum_{i=1}^{M} \left[ E\left\{ [x_k - \hat{x}_{k|k,\theta_i}][x_k - \hat{x}_{k|k,\theta_i}]^T \,\middle|\, Z_k, \theta_i \right\} + [\hat{x}_{k|k} - \hat{x}_{k|k,\theta_i}][\hat{x}_{k|k} - \hat{x}_{k|k,\theta_i}]^T \right] P(\theta_i \,|\, Z_k) \\
&= \sum_{i=1}^{M} \left[ P_{k|k,\theta_i} + \Delta P_{k|k,\theta_i} \right] P(\theta_i \,|\, Z_k)
\end{aligned}
\tag{5.39}
$$

where it is noted that $P_{k|k}$ is not defined under hypothesis $\theta_0$. $P_{k|k,\theta_i}$ is the usual
variance which is recursively computed by the Kalman filter under hypothesis $H_i$.
$\Delta P_{k|k,\theta_i}$ represents the price of model uncertainty. It represents the performance
degradation, or additional error, due to the fact that the model, characterized by
$\theta_i$, may not match the actual system that produced the measurements.

The estimator in (5.36) is the weighted sum of least squares estimators with
the weights being the a posteriori probabilities of the various hypotheses. Because
the cost of incorrect decisions is the same regardless of the decision, this estimator
is independent of the decision.
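
For a scalar state, the combined variance of (5.39) reduces to a few lines. The sketch below is illustrative only; the function name and the numbers are hypothetical, and the per-hypothesis means and variances are assumed to come from the individual filters.

```python
def combined_variance(posteriors, means, variances):
    """Scalar form of (5.39): sum over i of [P_i + (x_hat - x_hat_i)^2] * P(theta_i|Z_k)."""
    x_hat = sum(p * m for p, m in zip(posteriors, means))
    total = 0.0
    for p, m, v in zip(posteriors, means, variances):
        spread = (x_hat - m) ** 2      # model-uncertainty term for hypothesis i
        total += p * (v + spread)
    return x_hat, total


x_hat, p_kk = combined_variance([0.75, 0.25], [1.0, 3.0], [0.1, 0.2])
print(x_hat, p_kk)
```

The spread term vanishes only when every filter agrees with the combined estimate, which is why it can be read as the cost of not knowing the model.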

5.2  JD/E  for  Systems  in  Gaussian  Noise 

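For the measurement model (5.1), if the process and measurement noise are
assumed to be Gaussian, the density function $p(z_k \,|\, Z_{k-1}, \theta_i)$ used in (5.33) can be
expressed as

$$ p(z_k \,|\, Z_{k-1}, \theta_i) = \left| S_{k|k-1,\theta_i} \right|^{-1/2} \exp\left\{ -\tfrac{1}{2}\, \tilde{z}_{k|k-1,\theta_i}^T\, S_{k|k-1,\theta_i}^{-1}\, \tilde{z}_{k|k-1,\theta_i} \right\} \tag{5.40} $$

where

$$ \tilde{z}_{k|k-1,\theta_i} = z_k - h_k(\hat{x}_{k|k-1,\theta_i}, \theta_i) \tag{5.41} $$

and

$$ S_{k|k-1,\theta_i} = H_k(\hat{x}_{k|k-1,\theta_i}, \theta_i)\, P_{k|k-1,\theta_i}\, H_k(\hat{x}_{k|k-1,\theta_i}, \theta_i)^T + R_{k,\theta_i}. \tag{5.42} $$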
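
A scalar version of the Gaussian likelihood (5.40) to (5.42) might look like the sketch below. It is an assumption-laden illustration: the function and argument names are hypothetical, the measurement is scalar, and the $1/\sqrt{2\pi}$ normalizing constant is included even though it cancels when the posterior probabilities are normalized.

```python
import math


def gaussian_innovation_likelihood(z, z_pred, h_jac, p_pred, r):
    """Scalar Gaussian density of the measurement z under one hypothesis.

    z_pred : predicted measurement h_k(x_hat_{k|k-1}, theta_i)
    h_jac  : measurement Jacobian H_k evaluated at the prediction
    p_pred : predicted state variance P_{k|k-1}
    r      : measurement noise variance R_k
    """
    innov = z - z_pred                     # innovation, as in (5.41)
    s = h_jac * p_pred * h_jac + r         # innovation variance, as in (5.42)
    return math.exp(-0.5 * innov * innov / s) / math.sqrt(2.0 * math.pi * s)


print(gaussian_innovation_likelihood(1.2, 1.0, 1.0, 0.5, 0.5))
```

One such value per hypothesis is what the posterior recursion (5.33) consumes at each time step.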
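5.3 JD/E in Non-Gaussian Noise

Two non-Gaussian distributions that are of interest in subsequent chapters
are the Weibull and lognormal. The high order filter, developed in Chapter 3, is
used to perform parameter estimation in non-Gaussian noise. In the case of Weibull
measurement noise, the a posteriori pdf, given the estimate $\hat{x}_{k|k-1,\theta_i}$, can be updated
as follows for a scalar measurement model $z_k$:

$$ p_w(z_k \,|\, Z_{k-1}, \theta_i) = \frac{a}{\sigma_w} \left( \frac{\tilde{z}_{k|k-1,\theta_i} + \mu_w}{\sigma_w} \right)^{a-1} \exp\left\{ -\left( \frac{\tilde{z}_{k|k-1,\theta_i} + \mu_w}{\sigma_w} \right)^{a} \right\}. $$

The parameter $a$ is a known constant that controls the skewness of the distribution.
When $a = 2$ the Rayleigh distribution results. $\mu_w$ is the mean of the noncentral
Weibull distribution. The $n$th noncentral moment of the Weibull distribution is
given by

$$ E\left[ (\tilde{z}_{k|k-1,\theta_i} + \mu_w)^n \right] = \Gamma\left( \frac{a+n}{a} \right) \sigma_w^n \tag{5.46} $$

where $\Gamma(\cdot)$ is the Gamma function. Since $E[\tilde{z}_{k|k-1,\theta_i}] = 0$,

$$ \mu_w = \Gamma\left( \frac{a+1}{a} \right) \sigma_w. \tag{5.47} $$

The variance of $\tilde{z}_{k|k-1,\theta_i}$ is given by

$$ s_{k|k-1,\theta_i} = \left[ \Gamma\left( \frac{a+2}{a} \right) - \Gamma^2\left( \frac{a+1}{a} \right) \right] \sigma_w^2. \tag{5.48} $$

Given the parameter $a$ and the variance $s_{k|k-1,\theta_i}$, $\sigma_w$ can be found from (5.48),
and $\mu_w$ can subsequently be found from (5.47).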
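
The two-step recovery of the Weibull parameters can be sketched with the standard library Gamma function. This assumes the usual two-parameter Weibull form with shape $a$ and scale $\sigma_w$, shifted by its mean so that the innovation is zero mean; the function names are hypothetical.

```python
import math


def weibull_parameters(variance, a):
    """Return (sigma_w, mu_w) for shape a so that the shifted error has the given variance."""
    g1 = math.gamma((a + 1.0) / a)
    g2 = math.gamma((a + 2.0) / a)
    sigma_w = math.sqrt(variance / (g2 - g1 * g1))   # invert (5.48)
    mu_w = g1 * sigma_w                              # mean from (5.47)
    return sigma_w, mu_w


def weibull_likelihood(innov, sigma_w, mu_w, a):
    """Weibull density of the innovation shifted by mu_w; zero outside the support."""
    y = (innov + mu_w) / sigma_w
    if y <= 0.0:
        return 0.0
    return (a / sigma_w) * y ** (a - 1.0) * math.exp(-(y ** a))


sigma_w, mu_w = weibull_parameters(variance=0.05, a=2.0)   # Rayleigh case
print(sigma_w, mu_w, weibull_likelihood(0.0, sigma_w, mu_w, 2.0))
```

Because the density is one sided, innovations below $-\mu_w$ receive zero likelihood, which removes the corresponding hypothesis from the posterior at that step.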


If a scalar error $\tilde{z}_{k|k-1,\theta_i}$ is centrally distributed according to the lognormal
distribution, its density function is updated by

$$ p_l(z_k \,|\, Z_{k-1}, \theta_i) = \frac{1}{\sqrt{2\pi}\, \sigma_l\, (\tilde{z}_{k|k-1,\theta_i} + \mu_l)} \exp\left\{ -\frac{1}{2\sigma_l^2} \left( \ln(\tilde{z}_{k|k-1,\theta_i} + \mu_l) - \gamma \right)^2 \right\}. \tag{5.49} $$

The mean and variance of the lognormal distribution are

$$ \mu_l = E[\tilde{z}_{k|k-1,\theta_i} + \mu_l] = \exp(\gamma + \sigma_l^2/2) \tag{5.50} $$

and

$$ s_{k|k-1,\theta_i} = \exp(2\gamma + 2\sigma_l^2) - \exp(2\gamma + \sigma_l^2). \tag{5.51} $$

Thus, given the variance $s_{k|k-1,\theta_i}$ and the parameter $\gamma$ for the lognormal distribution,
$\sigma_l$ can be obtained from (5.51), and $\mu_l$ can be subsequently obtained from (5.50).
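
Equation (5.51) is a quadratic in $u = \exp(\sigma_l^2)$, so the inversion has a closed form. The sketch below works through it; the function name is hypothetical and the variance is assumed to be strictly positive.

```python
import math


def lognormal_parameters(variance, gamma):
    """Return (sigma_l, mu_l) such that exp(2g + 2s^2) - exp(2g + s^2) equals the variance."""
    # With u = exp(sigma_l^2), (5.51) reads exp(2*gamma) * (u*u - u) = variance.
    c = variance * math.exp(-2.0 * gamma)
    u = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * c))   # positive root, u > 1 when variance > 0
    sigma_sq = math.log(u)
    mu_l = math.exp(gamma + 0.5 * sigma_sq)      # mean from (5.50)
    return math.sqrt(sigma_sq), mu_l


sigma_l, mu_l = lognormal_parameters(variance=2.0, gamma=0.0)
print(sigma_l, mu_l)
```

Only the positive root is kept because $u = \exp(\sigma_l^2)$ must exceed one for a nonzero variance.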

5.4  JD/E  with  Model  Uncertainty 


Consider the situation in which several different hypotheses are to be
evaluated where the measurement model for each hypothesis may not be a function of
all the elements of the state vector $x_k$. For example, let $x_k$ consist of two
subvectors $x_{1k}$ and $x_{2k}$, such that $x_k = [x_{1k}^T : x_{2k}^T]^T$. Furthermore, let the measurement
equation as a function of $\theta_i = [\vartheta_{i1}\ \vartheta_{i2}]^T$ be given by

$$
\begin{aligned}
H_i:\quad z_k &= h_k(x_k, \theta_i) + v_k \\
&= \vartheta_{i1}\, h_{1k}(x_{1k}) + \vartheta_{i2}\, h_{2k}(x_{2k}) + v_k.
\end{aligned}
\tag{5.52}
$$

Thus $h_k$ is segmented into two separate models, $h_{1k}$ and $h_{2k}$. Each model is a
function of a subset of the state $x_k$. Let us consider four hypotheses. Under
hypothesis $H_0$ no signal is present. Under hypotheses $H_1$ and $H_2$, signals $h_{1k}$ and
$h_{2k}$ are present, respectively. Hypothesis $H_3$ represents the situation where both
signals are present. The possible values of the parameter vector $\theta$ corresponding to
the four different hypotheses are given by $\theta_0 = [0\ 0]^T$, $\theta_1 = [1\ 0]^T$, $\theta_2 = [0\ 1]^T$, and
$\theta_3 = [1\ 1]^T$. The goal is to find the model that best fits the received data $z_k$. For this
situation it is not proper to use the optimal estimate represented by equation (5.36)
since all state variables cannot be estimated by each model. Under these conditions
the maximum a posteriori (MAP) decision rule is used. The model $\theta_i$ is chosen
such that $P(\theta_i \,|\, Z_k) = \max\{P(\theta_j \,|\, Z_k),\ j = 0, \ldots, M\}$, where $M + 1$ is the number of
hypotheses, including the null hypothesis. Note that for the example given above,
if $H_0$ is chosen no estimate is required. If $H_1$ is chosen, then $x_{1k}$ can be estimated.
If $H_2$ is chosen, then $x_{2k}$ can be estimated, and if $H_3$ is chosen, the full state vector
$x_k$ is estimated. Chapter 6 uses MAP estimation to perform model order selection
for a system of sinusoids in Gaussian and non-Gaussian noise.
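
The MAP rule for the four-hypothesis example can be sketched as below. The posterior values are made up for illustration, and the names `THETAS`, `map_decision`, and `estimated_subvectors` are hypothetical.

```python
THETAS = [(0, 0), (1, 0), (0, 1), (1, 1)]   # theta_0 .. theta_3 from the example


def map_decision(posteriors):
    """Index i maximizing P(theta_i | Z_k); ties go to the lowest index."""
    return max(range(len(posteriors)), key=lambda i: posteriors[i])


def estimated_subvectors(index):
    """Names of the state subvectors that can be estimated under hypothesis H_index."""
    names = ("x1", "x2")
    return [n for n, flag in zip(names, THETAS[index]) if flag]


i = map_decision([0.05, 0.15, 0.10, 0.70])
print(i, estimated_subvectors(i))
```

Unlike the weighted sum of (5.36), this rule reports the estimate from a single filter, so no averaging is attempted across models with different state content.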

If the dimensionality of the state vector is different between models, or if state variable assignments are different between the various hypotheses, or if some of the state variables are unobservable between models, then the optimal estimate represented by equation (5.36) no longer applies. That is, if not all variables can be estimated under each hypothesis (i.e. under each model), then it is not proper to form a combined estimate by summing the estimates from each model weighted by the a posteriori probability of each hypothesis.

5.5  JD/E  with  Uncertain  Initial  Conditions 

Joint detection/estimation theory may also be applied to systems in which the measurement model is the same among all of the hypotheses but in which the different hypotheses are used to distinguish different sets of initial conditions. This can be very useful for nonlinear estimation problems since it is well known that the performance of approximate estimators such as the extended Kalman filter is susceptible to errors in the initial estimates. This is due primarily to the first order Taylor series expansion used in the filter.

Consider the scalar case in which the initial estimation error on the state variable is uniformly distributed in $[-w/2, w/2]$ with mean $\hat{x}_{0|0}$. Let the width of the uniform distribution be $w$. The initial variance is $p_{0|0} = w^2/12$. Now consider the situation in which two models are used for the initial conditions, with each model accounting for one half of the uncertainty in the original model. In the first model, represented by $\theta_1$, the initial estimate has mean $\hat{x}_{0|0,\theta_1} = \hat{x}_{0|0} - w/4$. This mean is in the center of the left half of the original distribution for $x_{0|0}$. The width of the uniform distribution for model 1 is $w/2$, and the initial estimation error variance is $p_{0|0,\theta_1} = w^2/48$. Similarly the mean and variance of the initial estimation error for model 2 are $\hat{x}_{0|0,\theta_2} = \hat{x}_{0|0} + w/4$ and $p_{0|0,\theta_2} = w^2/48$.

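For this example let $\theta_1 = -1$, and $\theta_2 = 1$. The two hypotheses can then be represented as

$$\begin{aligned}
H_i : z_k &= h_k(x_k) + v_k \\
\hat{x}_{0|0,\theta_i} &= \hat{x}_{0|0} + \theta_i w/4 \\
p_{0|0,\theta_i} &= w^2/48
\end{aligned} \tag{5.53}$$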

With this model the performance of the extended Kalman filter is likely to be significantly more stable, as the initial estimation error variance is reduced by a factor of 4 compared to the original model. Since the state variables are the same for each model the optimal estimate described by (5.36) can be implemented. The a priori probabilities $P(\theta_i)$ are obtained by integrating the original initial density function over the limits used to partition the initial error into the separate hypotheses. For the example given above $P(\theta_1) = P(\theta_2) = 0.5$.
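The partitioning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `partition_uniform_prior` and the generalization to `n_parts` equal-width sub-intervals are hypothetical and do not come from the text, which treats only the two-partition case.

```python
def partition_uniform_prior(mean, width, n_parts):
    """Split a uniform initial-error density into equal-width sub-intervals.

    Returns one (prior_probability, sub_mean, sub_variance) tuple per hypothesis.
    """
    sub_width = width / n_parts
    left_edge = mean - width / 2.0
    hypotheses = []
    for i in range(n_parts):
        sub_mean = left_edge + (i + 0.5) * sub_width   # center of sub-interval i
        sub_variance = sub_width ** 2 / 12.0           # variance of a uniform density
        prior = sub_width / width                      # integral of the original density
        hypotheses.append((prior, sub_mean, sub_variance))
    return hypotheses


# Two partitions of a width-w density, as in the example in the text.
w = 2.0
parts = partition_uniform_prior(mean=0.0, width=w, n_parts=2)
```

With two partitions each hypothesis receives prior 0.5, a mean shifted by w/4 from the original mean, and variance w^2/48, a quarter of the original w^2/12.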

This  procedure  could  be  particularly  useful  whenever  the  density  function 
for  the  initial  estimation  error  is  multimodal.  A  filter  could  be  constructed  for  each 
mode  of  the  density  function  thereby  greatly  reducing  the  initial  error  variance. 

As  the  number  of  partitions  increases  the  joint  detection/estimation  problem 
becomes  one  of  detection  only.  That  is,  the  initial  estimation  error  becomes  so  small 
that  the  implementation  of  the  filter  does  not  improve  the  estimate.  Thus  there  is 
a  tradeoff  between  estimation  accuracy  and  the  computational  burden  imposed  by 
the  implementation  of  several  extended  Kalman  filters  in  parallel. 

Chapter 7 uses joint detection/estimation for estimation of radar/sonar signal parameters in which the measurement model is the same for each hypothesis, but the initial estimates of the time delay and Doppler shift distinguish the hypotheses.

5.6  JD/E  with  Model  Uncertainty  and  Uncertain  Initial  Conditions 

The estimation procedures discussed in the previous two sections can be used together, so that multiple hypotheses are tested and each hypothesis has several sets of initial estimates. An application of this is given in Chapter 8 where radar/signal parameters are estimated from multiple sensor measurements.

5.7  Summary 

A general procedure for joint detection/estimation has been presented. It is shown that this procedure may be used to segment the initial conditions of an estimation problem, effectively controlling the unstable behavior that characterizes nonlinear filtering techniques such as the extended Kalman filter in the presence of large initial uncertainty. It is also shown that the joint detection/estimation procedure can be used for estimation problems with model uncertainty. In the following chapters this procedure is applied to specific signal processing problems.

Chapter 6

Joint  Detection/Estimation  for  Model  Order  Selection 

This chapter presents a general approach to determining the number of sinusoids present in measurements corrupted by additive white Gaussian and non-Gaussian noise. The approach involves the simultaneous application of maximum a posteriori (MAP) detection and nonlinear estimation using either the extended Kalman filter when the noise is Gaussian, or the extended high order filter (EHOF) when the noise is non-Gaussian. The problem is formulated as a multiple hypothesis testing problem with assumed known a priori probabilities for each hypothesis. Each hypothesis represents a different measurement model. The unknown parameters for each model are estimated recursively along with the a posteriori probability of the hypothesis. The general technique for joint detection/estimation is presented in Chapter 5.

Other order selection methods [60-62] take the form of a function of the hypothesized number of parameters which penalizes over-estimation of the actual number of parameters when added to the log-likelihood function. A technique described by Fuchs [63] uses eigenvector decomposition of the estimated autocorrelation matrix and is based on matrix perturbation analysis. In all of the autocorrelation techniques the additive noise is assumed to be Gaussian. Rao and Vaidyanathan [64] use a cumulant-based approach to estimate model order in non-Gaussian noise. In contrast to these methods, the technique used in this chapter is based on Bayes' theorem. The advantage of this technique is that it can be used in both Gaussian and non-Gaussian noise. It is completely general in that it applies to arbitrary density functions.

A typical method for determining the number of sinusoids present in a received signal is to form a model in which the number of bins in the FFT is the maximum number of sinusoids present in the signal. If it is assumed that there is one sinusoid present in the measurement, then the number of hypotheses to be tested is $N$ (the number of bins in the FFT). If there are two sinusoids present, then $N!/(2(N-2)!)$ hypotheses must be tested. If it is unknown whether one or two sinusoids are present, then $N + N!/(2(N-2)!)$ hypotheses must be tested. The obvious disadvantage of this approach is the exponential computational complexity of testing all hypotheses. In addition, the resolution of the frequencies is limited by the bin size of the FFT.
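To give a sense of scale, the hypothesis count can be worked out for a specific FFT length. The value $N = 64$ below is an illustrative assumption, not a figure from the text:

$$N + \frac{N!}{2\,(N-2)!} = 64 + \frac{64 \cdot 63}{2} = 64 + 2016 = 2080.$$

Allowing a third sinusoid would add a further $\binom{64}{3} = 41664$ hypotheses, which shows how quickly the exhaustive search grows with the assumed model order.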

The method used in this chapter assumes that, if one sinusoid is present, then the initial estimates of the amplitude and frequency are known to within some known mean and variance; that is, the distribution of the initial estimation error is assumed to be known. The procedure also allows for time varying variables (not allowed in the FFT method).

Simulation results are presented for the estimation of up to four sinusoids in white Gaussian and non-Gaussian noise, when the actual number is two. In Gaussian noise the extended Kalman filter is used to perform estimation. In non-Gaussian noise the extended high order filter (EHOF) developed in Chapter 3 is used to perform estimation.

6.1  Joint  Detection/Estimation  Applied  to  Model  Order  Selection 

The problem of model order selection can be cast into the framework of joint detection/estimation with model uncertainty. Section 5.4 describes the general solution for joint detection/estimation problems with model uncertainty. Consider the situation in which several different hypotheses are to be evaluated where the measurement model for each hypothesis may not be a function of all of the state variables in the vector $x_k$. For example, let $x_k$ consist of $P$ subvectors such that $x_k = [x_{1k}^T : x_{2k}^T : \cdots : x_{Pk}^T]^T$. Define the binary-element parameter vector $\theta_i = [\vartheta_{i1}\ \vartheta_{i2}\ \cdots\ \vartheta_{iP}]^T$ with $\vartheta_{ip} = 0$ or $1$. The measurement model is given by

$$H_i : z_k = g_k(x_k, \theta_i) + v_k = \sum_{p=1}^{P} \vartheta_{ip}\, g_{pk}(x_{pk}) + v_k \tag{6.1}$$

with plant equation

(6.2)

with initial state estimate $\hat{x}_{0|0,\theta_i}$, and initial state covariance $P_{0|0,\theta_i}$. The initial state estimate, the measurement noise, and the process noise are uncorrelated. The process and measurement noise are zero mean and distributed with covariances $E[w_{k,\theta_i} w_{k,\theta_i}^T] = Q_{k,\theta_i}$ and $E[v_{k,\theta_i} v_{k,\theta_i}^T] = R_{k,\theta_i}$.

Hence, $\vartheta_{ip} = 1$ indicates the presence of the $p$th term in the $i$th model, and $\vartheta_{ip} = 0$ indicates its absence from the model. $g_k$ is segmented into $P$ separate models, with each model being a function of some subset of the state $x_k$. Under hypothesis $H_0$ no signal is present and $\theta_0 = [0\ 0\ \cdots\ 0]^T$. There are $P$ different possible combinations of one model only. The number of different combinations for more than one model is obtained from the binomial expansion. The total number of different models that can be accommodated by the measurement equation (6.1) is

$$N_t = \sum_{p=0}^{P} \frac{P!}{p!\,(P-p)!}$$
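As a worked instance of this count, take $P = 4$ candidate terms, the upper bound used in the experiments of Section 6.3. Counting the binary parameter vectors by the number of terms they include, with the null hypothesis counted as the term with no signals, gives

$$\binom{4}{0} + \binom{4}{1} + \binom{4}{2} + \binom{4}{3} + \binom{4}{4} = 1 + 4 + 6 + 4 + 1 = 16 = 2^4.$$

The experimental evaluation later in this chapter considers only five of these sixteen parameter vectors, namely the nested models containing the first zero, one, two, three, or four sinusoids.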

The goal is to find the model that best fits the received data $z_k$, i.e. to select the parameter vector $\theta_i$ that gives the best fit. Given the measurements modeled as (6.1), for model order selection it is not proper to use the MMSE estimate represented by equation (5.36), since not all state variables can be estimated by each model. Under these conditions the maximum a posteriori decision rule should be used for model selection according to:

$$\text{Choose } H_i : \theta_i = \arg\max_{\theta_m \in \Theta} P(\theta_m|Z_k), \quad m = 0, \ldots, M \tag{6.3}$$


where $M + 1$ is the total number of hypotheses tested. The recursion for $P(\theta_m|Z_k)$ from Chapter 5 is

$$P(\theta_i|Z_k) = \frac{P(\theta_i|Z_{k-1})\,\Lambda_i(z_k)}{\sum_{j=0}^{M} P(\theta_j|Z_{k-1})\,\Lambda_j(z_k)} \tag{6.4}$$

where $\Lambda_i(z_k)$ is the likelihood ratio

$$\Lambda_i(z_k) = \frac{p(z_k|Z_{k-1},\theta_i)}{p(z_k|Z_{k-1},\theta_0)}, \tag{6.5}$$

and where $Z_{k-1} = \{z_1, z_2, \ldots, z_{k-1}\}$. The initial condition for (6.4) is the a priori probability density function $p(\theta) = p(\theta|Z_0)$, which is assumed to be known. The conditional probability densities $p(z_k|Z_{k-1},\theta_i)$ are updated using the EKF or the EHOF as described below.
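A minimal Python sketch of one step of the a posteriori recursion (6.4) followed by the MAP rule (6.3) is given here. It assumes the recursion is normalized over all $M + 1$ hypotheses so that the posteriors sum to one; the function names `update_posteriors` and `map_decision` and the numerical likelihood values are hypothetical and serve only as an illustration.

```python
import numpy as np


def update_posteriors(prior, likelihoods):
    """One step of recursion (6.4).

    prior       -- P(theta_i | Z_{k-1}) for i = 0..M
    likelihoods -- p(z_k | Z_{k-1}, theta_i) for i = 0..M
    """
    prior = np.asarray(prior, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)
    ratios = likelihoods / likelihoods[0]     # likelihood ratios (6.5); ratio for H_0 is 1
    unnormalized = prior * ratios
    return unnormalized / unnormalized.sum()  # normalize so the posteriors sum to one


def map_decision(posterior):
    """MAP rule (6.3): index of the hypothesis with the largest posterior."""
    return int(np.argmax(posterior))


# Five equally likely hypotheses and made-up likelihood values for one measurement.
posterior = update_posteriors([0.2, 0.2, 0.2, 0.2, 0.2],
                              [0.01, 0.05, 0.30, 0.10, 0.04])
chosen = map_decision(posterior)
```

In a long data record the likelihoods can become very small, so a practical implementation would probably accumulate log-likelihoods instead of forming the ratios directly.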


6.2  General  System  Model  for  Model  Order  Selection 


Consider the problem of estimating the parameters of $P$ unknown sinusoids from $K$ measurements. The scalar measurement model for hypothesis $H_i$ is given by

$$z_k = \sum_{p=1}^{P} \vartheta_{ip}\, c_{pk} \exp(-\alpha_{pk} k) \sin(\omega_{pk} k + \phi_{pk}) + v_k \tag{6.6}$$

for $k = 0, 1, \ldots, K-1$. It is assumed that the frequencies $\omega_{pk}$ are normalized so that the effective sampling interval is one second. The noise $v_k$ is assumed to be a white noise sequence with variance $\sigma_v^2$. The objective is to estimate some or all of the $4P$ possibly time varying parameters in this system based on the measurements. Define the elements of the state variable subvector $x_{pk}$ as

$$\begin{aligned}
x_{pk}(1) &= \omega_{pk} \\
x_{pk}(2) &= c_{pk} \\
x_{pk}(3) &= \phi_{pk} \\
x_{pk}(4) &= \alpha_{pk}.
\end{aligned} \tag{6.7}$$

6.3  Model  Order  Selection  Experimental  Evaluation 


In this section the performance of the joint detection/estimation method is evaluated experimentally. The number of sinusoids is unknown except for an upper bound. Furthermore, it is assumed that the damping coefficients and phases are all equal to zero. The amplitudes and frequencies are assumed to be either known or unknown. When they are unknown, estimates of them are obtained along with the model order selection. Assuming the upper bound on the number of sinusoids to be four, the measurement equation becomes

$$z_k = g_k(x_k, \theta_i) + v_k = \sum_{p=1}^{4} \vartheta_{ip}\, g_{pk}(x_{pk}) + v_k \tag{6.8}$$

where

$$g_{pk}(x_{pk}) = c_{pk} \sin(\omega_{pk} k). \tag{6.9}$$

The state variables are defined as

$$\begin{aligned}
x_{pk}(1) &= \omega_{pk} \\
x_{pk}(2) &= c_{pk}.
\end{aligned} \tag{6.10}$$

The states are assumed to be constant with random initial values so that the plant equation becomes

$$x_{k+1} = x_k. \tag{6.11}$$

Since $P_{k|k-1,\theta_i} = P_{k-1|k-1,\theta_i}$ the extrapolation equation is redundant and the EKF equations (section 2.4.1) simplify to

$$\begin{aligned}
\tilde{z}_{k|k-1,\theta_i} &= G_k \tilde{x}_{k|k-1,\theta_i} + v_k \\
\hat{x}_{k|k,\theta_i} &= \hat{x}_{k-1|k-1,\theta_i} + K_k \tilde{z}_{k|k-1,\theta_i} \\
S_{k|k-1,\theta_i} &= G_k P_{k-1|k-1,\theta_i} G_k^T + R_k \\
K_k &= P_{k-1|k-1,\theta_i} G_k^T S_{k|k-1,\theta_i}^{-1} \\
P_{k|k,\theta_i} &= (I - K_k G_k) P_{k-1|k-1,\theta_i}
\end{aligned} \tag{6.12}$$

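where the first partial derivative of the measurement nonlinearity (6.8, 6.9) under hypothesis $H_i$ is given by

$$G_k = \left[ \vartheta_{i1} \frac{\partial g_{1k}}{\partial x_{1k}} : \vartheta_{i2} \frac{\partial g_{2k}}{\partial x_{2k}} : \vartheta_{i3} \frac{\partial g_{3k}}{\partial x_{3k}} : \vartheta_{i4} \frac{\partial g_{4k}}{\partial x_{4k}} \right] \tag{6.13}$$

where

$$\frac{\partial g_{pk}}{\partial x_{pk}} = \begin{bmatrix} x_{pk}(2)\, k \cos(x_{pk}(1)\, k) \\ \sin(x_{pk}(1)\, k) \end{bmatrix}^T. \tag{6.14}$$

For estimation in Gaussian noise $\tilde{z}_{k|k-1,\theta_i}$ and $S_{k|k-1,\theta_i}$ from (6.12) are used in (5.44) for computation of the a posteriori probability (6.4).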
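The simplified EKF update (6.12) with the Jacobian of (6.13) and (6.14) can be sketched in Python as follows. Several details are assumptions made for illustration: the function names, the stacked state ordering `[omega_1, c_1, omega_2, c_2, ...]`, and the computation of the innovation as the measurement minus the model evaluated at the prior estimate, which is the usual EKF form.

```python
import numpy as np


def g_and_jacobian(x, theta, k):
    """Measurement g_k(x, theta) from (6.8)-(6.9) and its Jacobian (6.13)-(6.14).

    x     -- stacked state [omega_1, c_1, omega_2, c_2, ...] as a NumPy array
    theta -- binary vector selecting the sinusoids in the hypothesis
    """
    omega, amp = x[0::2], x[1::2]
    theta = np.asarray(theta, dtype=float)
    g = np.sum(theta * amp * np.sin(omega * k))
    G = np.empty((1, x.size))
    G[0, 0::2] = theta * amp * k * np.cos(omega * k)   # derivative w.r.t. omega_p
    G[0, 1::2] = theta * np.sin(omega * k)             # derivative w.r.t. c_p
    return g, G


def ekf_update(x, P, z, theta, k, R):
    """One measurement update of (6.12) for a constant state and scalar z."""
    g, G = g_and_jacobian(x, theta, k)
    innovation = z - g                        # innovation at the prior estimate
    S = (G @ P @ G.T).item() + R              # innovation variance
    K = (P @ G.T) / S                         # gain, shape (n, 1)
    x_new = x + K[:, 0] * innovation
    P_new = (np.eye(x.size) - K @ G) @ P
    return x_new, P_new, innovation, S
```

The returned `innovation` and `S` are the quantities that would feed the Gaussian likelihood used in the a posteriori recursion; terms switched off by the parameter vector get zero Jacobian entries, so their state estimates are left unchanged by the update.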


If the noise is non-Gaussian, the EHOF equations from Chapter 3 are used for estimation. For the EHOF the innovations vector $\tilde{z}_{k|k-1,\theta_i}$ and its variance $S_{k|k-1,\theta_i}$ are obtained in the same manner as they are in the EKF (6.12). However, in the EHOF the update equations for the filter variance $P_{k|k-1,\theta_i}$ (3.30 or 3.54) are much more complicated than they are for the EKF. In the experimental analysis that follows the measurement noise was distributed according to Weibull and lognormal distributions. The form of the density functions $p(z_k|Z_{k-1},\theta_i)$, which is required in (6.5), is given in section 5.3 for these distributions.

Five separate models are considered in the experimental evaluation. In the first model it is assumed that no signal is present, and $\theta_0 = [0\ 0\ 0\ 0]$, corresponding to the null hypothesis. The other four models correspond to the hypotheses that one, two, three or four different sinusoids are present in the measurements. The parameter vectors for these models are given by

$$\begin{aligned}
\theta_1 &= [1\ 0\ 0\ 0] \\
\theta_2 &= [1\ 1\ 0\ 0] \\
\theta_3 &= [1\ 1\ 1\ 0] \\
\theta_4 &= [1\ 1\ 1\ 1]
\end{aligned} \tag{6.15}$$

The a priori probability is chosen to be the same for each model, i.e. $P(\theta_i|Z_0) = 1/5$, $i = 0, \ldots, 4$. The measurements (6.9) are modeled using four sinusoids with amplitudes $c_{pk} = 1$, for $p = 1, \ldots, 4$, and normalized frequencies $\omega_{1k} = 0.12 \cdot 2\pi$, $\omega_{2k} = 0.22 \cdot 2\pi$, $\omega_{3k} = 0.32 \cdot 2\pi$, and $\omega_{4k} = 0.42 \cdot 2\pi$. In the actual data only the first two sinusoids are present, namely the ones with frequencies $\omega_{1k}$ and $\omega_{2k}$. The hypotheses are indexed according to the number of sinusoids assumed present in the data. The actual model corresponds to hypothesis $H_2$.

Three separate scenarios are evaluated. In the first scenario it is assumed that there is no initial estimation error in the state variables, and therefore, estimation is not required. This corresponds to the detection-only case. In the second scenario only the frequencies are estimated. For this scenario equation (6.9) is modified accordingly to include only frequency variables. In the third scenario, both frequencies and amplitudes are estimated. Performance of the technique is evaluated via Monte Carlo simulations with Gaussian, Rayleigh, and lognormal measurement noise. The parameter $a = 2$ is used for the Rayleigh distribution (equation (5.45)), and $\gamma = 0$ is used for the lognormal distribution (equation (5.49)). In the scenarios in which the EHOF is employed, the 2nd, 3rd, and 4th order statistics of the measurement noise are used in the filter implementation.

The detection results for scenario 1, the detection-only case, are presented in Tables 6.1, 6.2, and 6.3 for detection in Gaussian, Rayleigh, and lognormal noise, respectively. These tables contain the number of detection decisions for each model. The column labeled $P(\theta_2|Z_k)$ gives the average a posteriori probability of the hypothesis $H_2$ for those simulation runs which chose $H_2$ as having the highest a posteriori probability. The results are shown as a function of signal to noise ratio (SNR) and the probability density function (pdf) type used in computing the likelihood ratio in equation (6.5). The SNR is defined as $10\log(c_{pk}^2/(2\sigma_v^2))$, where $\sigma_v^2$ is the measurement noise variance. Since the amplitude is equal to one for all sinusoids, the SNR is $10\log(1/(2\sigma_v^2))$. In Table 6.1 the measurement noise is Gaussian and only the Gaussian pdf is used to propagate the a posteriori probability. In Table 6.2 the noise is Rayleigh, and the a posteriori probability is computed using both the Rayleigh and Gaussian densities. Table 6.3 shows the results in lognormal noise. Tables 6.2 and 6.3 illustrate the importance of choosing the proper density function to make the detection decision.

Table 6.1. MAP Decisions as a Function of SNR
Gaussian Noise - Detection Only

SNR(dB)  pdf Type   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       Gaussian   3     15    171   10    1     0.865
 0       Gaussian   ?     2     196   2     0     0.991
 5       Gaussian   0     0     200   0     0     1.0

(? = entry illegible in the source)

Table 6.2. MAP Decisions as a Function of SNR
Rayleigh Noise - Detection Only

SNR(dB)  pdf Type   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       Gaussian   ?     16    169   ?     ?     0.872
-5       Rayleigh   ?     8     180   ?     ?     0.925
 0       Gaussian   ?     3     197   ?     ?     0.991
 0       Rayleigh   ?     0     199   ?     ?     0.999
 5       Gaussian   ?     0     200   ?     ?     1.
 5       Rayleigh   ?     0     200   ?     ?     1.

Table 6.3. MAP Decisions as a Function of SNR
Lognormal Noise - Detection Only

SNR(dB)  pdf Type    H0    H1    H2    H3    H4    P(θ2|Zk)
-5       Gaussian    2     15    174   8     1     ?
-5       Lognormal   0     0     200   ?     0     ?
 0       Gaussian    0     2     197   ?     ?     0.992
 0       Lognormal   0     ?     200   0     ?     1.0
 5       Gaussian    0     ?     200   0     ?     1
 5       Lognormal   0     ?     200   ?     ?     1

(? = entry illegible in the source)
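The detection-only scenario can be imitated with a small Monte Carlo sketch in Python for the Gaussian case. This is not the simulation code behind the tables: the record length of 25 samples, the random seed, the batch (rather than recursive) evaluation of the posterior, and all function names are assumptions made for illustration. With known parameters, equal priors, and white Gaussian noise, the posterior of each hypothesis is proportional to the exponential of its negative squared error scaled by the noise variance.

```python
import numpy as np

# Candidate frequencies and nested parameter vectors taken from the text.
FREQS = 2 * np.pi * np.array([0.12, 0.22, 0.32, 0.42])
THETAS = np.array([[0, 0, 0, 0],
                   [1, 0, 0, 0],
                   [1, 1, 0, 0],
                   [1, 1, 1, 0],
                   [1, 1, 1, 1]], dtype=float)


def noise_variance(snr_db, amplitude=1.0):
    """Invert SNR = 10 log10(c^2 / (2 sigma^2)) for sigma^2."""
    return amplitude ** 2 / (2.0 * 10.0 ** (snr_db / 10.0))


def detect(z, sigma2):
    """Return the MAP hypothesis index and the posterior vector (equal priors)."""
    k = np.arange(z.size)
    basis = np.sin(np.outer(FREQS, k))          # unit-amplitude sinusoids, shape (4, K)
    models = THETAS @ basis                     # noiseless signal under each hypothesis
    log_like = -0.5 * np.sum((z - models) ** 2, axis=1) / sigma2
    post = np.exp(log_like - log_like.max())    # subtract the max for numerical stability
    post /= post.sum()
    return int(np.argmax(post)), post


def run_trials(snr_db, n_trials=200, n_samples=25, seed=0):
    """Count MAP decisions when only the first two sinusoids are present."""
    rng = np.random.default_rng(seed)
    sigma2 = noise_variance(snr_db)
    k = np.arange(n_samples)
    truth = THETAS[2] @ np.sin(np.outer(FREQS, k))
    counts = np.zeros(len(THETAS), dtype=int)
    for _ in range(n_trials):
        z = truth + rng.normal(0.0, np.sqrt(sigma2), n_samples)
        counts[detect(z, sigma2)[0]] += 1
    return counts
```

Calling `run_trials(-5.0)`, `run_trials(0.0)`, and `run_trials(5.0)` gives decision counts per hypothesis in the same layout as the rows of Table 6.1, although the exact numbers depend on the assumed record length and seed.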

For scenario 2 it is assumed that the signal amplitudes $c_{pk}$ are known. The frequencies $\omega_{pk}$ are estimated for each model. The standard deviation of the initial estimation error for the frequency in each model is $\sigma = 0.1$. Table 6.4 shows the number of times each hypothesis is chosen as a function of signal to noise ratio (dB) when the measurement noise is Gaussian and the measurements are processed with the EKF. The EHOF gives the same results as the EKF whenever the noise is Gaussian. Tables 6.5 and 6.6 show the results whenever the measurement noise is Rayleigh and lognormal, respectively. The development of the EKF is based on the fact that the filter error is a first order function of the innovations process. Thus, only first and second order statistics are necessary for EKF implementation. Therefore the EKF provides an optimal solution in Gaussian noise (provided the Taylor series approximation is valid). Although the EKF does not give optimal performance in non-Gaussian noise, it is evaluated in Tables 6.5 and 6.6 in order to compare its performance to the EHOF in non-Gaussian noise. The EKF is evaluated in two configurations in these tables. The rows in these tables labeled 'EKF_g' denote the performance of the EKF in which the density function in (6.5) is evaluated using a Gaussian density. 'EKF_r' and 'EKF_l' correspond to the performance of the EKF in which the Rayleigh and lognormal density functions are used for computation of the likelihood ratio. The EHOF employs the appropriate pdf associated with the measurement noise.

These data demonstrate that the nonlinear filtering techniques can give excellent performance for model order selection. Tables 6.5 and 6.6 demonstrate that the detection error probability for the EHOF is lower than that for the EKF in non-Gaussian noise, especially when the EKF is used in conjunction with the Gaussian density function. The EKF performs much better whenever the proper (Rayleigh or lognormal) density function is used. Furthermore, the EHOF decides with a higher confidence than the EKF, as demonstrated by the a posteriori probability $P(\theta_2|Z_k)$. This difference occurs primarily at low values of the SNR.

The EHOF performs better relative to the EKF in lognormal noise than it does in Rayleigh noise. This is due to the fact that the lognormal noise has a higher degree of skewness than does the Rayleigh noise. That is, the EHOF has more of an advantage whenever the higher order statistics are large relative to what they would be in Gaussian noise.


Table 6.4. MAP Decisions as a Function of SNR
Gaussian Noise - Frequencies Estimated

SNR(dB)  Filter   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       EKF      46    45    107   ?     ?     ?
 0       EKF      ?     15    180   ?     ?     ?
 5       EKF      ?     ?     200   ?     ?     ?

Table 6.5. MAP Decisions as a Function of SNR
Rayleigh Noise - Frequencies Estimated

SNR(dB)  Filter   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       EKF_g    44    46    108   2     0     0.844
-5       EKF_r    23    28    135   6     8     0.924
-5       EHOF     ?     ?     ?     ?     ?     ?

(The remaining rows of Table 6.5 are illegible in the source; only three further values of the last column survive: 0.926, 0.975, 0.995.)

Table 6.6. MAP Decisions as a Function of SNR
Lognormal Noise - Frequencies Estimated

SNR(dB)  Filter   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       EKF_g    41    47    104   1     ?     0.801
-5       EKF_l    27    30    129   11    ?     0.934
-5       EHOF     24    20    146   5     ?     0.954
 0       EKF_g    3     13    183   1     ?     ?
 0       EKF_l    1     12    181   6     0     ?
 0       EHOF     3     7     188   2     0     ?
 5       EKF_g    ?     ?     199   0     0     0.998
 5       EKF_l    ?     ?     198   1     0     0.999
 5       EHOF     ?     0     200   ?     1     1.0

(? = entry illegible in the source)

Figures 6.1, 6.2, and 6.3 display the sample variance of the estimation error of the two estimated amplitudes as a function of SNR for estimation in Gaussian, Rayleigh, and lognormal noise. For Figure 6.1, the sample variance is computed only from those trials in the Monte Carlo simulation which resulted in the EKF choosing the correct hypothesis. Figures 6.2 and 6.3 display the sample variance for those trials which resulted in both the EKF and the EHOF choosing the correct hypothesis. The CR bound on the estimation error is also shown in these figures. A noise discrimination test is used in an attempt to detect poor estimates. This test involves discarding any estimate for which the sample variance of the residual, computed over $k = 0, \ldots, 24$, is greater than twice the noise variance $\sigma_v^2$. The results of using this test are also shown on Figures 6.1 - 6.3. The MSE of the EHOF is slightly lower than that of the EKF, with both filters producing results that are close to the Cramer-Rao bound.

Figure 6.1 Nonlinear Filter Performance for the 2-State Model in Gaussian Noise, $P_0 = 0.01 I$

Figure 6.2 (caption illegible in the source; legend: CR Bound, EKF; vertical axis: 10 log(1/MSE))

Figure 6.3 Nonlinear Filter Performance for the 2-State Model in Lognormal Noise, $P_0 = 0.01 I$
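The noise discrimination test described above amounts to a single variance comparison, sketched here in Python. The function name, the use of the unbiased sample variance (`ddof=1`), and the interpretation of the residual as the measurement minus the model evaluated at the final estimate are assumptions; the text specifies only the threshold of twice the noise variance.

```python
import numpy as np


def passes_noise_discrimination(residuals, noise_variance, factor=2.0):
    """Keep an estimate only if the residual sample variance is small enough.

    residuals      -- measurement residuals over the data record
                      (k = 0..24 in the text)
    noise_variance -- known measurement noise variance
    factor         -- rejection threshold as a multiple of the noise variance
    """
    sample_variance = np.var(np.asarray(residuals, dtype=float), ddof=1)
    return bool(sample_variance <= factor * noise_variance)
```

An estimate whose residuals look like the measurement noise passes the test, while an estimate that has converged to a wrong frequency leaves unmodeled signal energy in the residual and is more likely to be discarded.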

In scenario 3 both the signal amplitudes $c_{pk}$ and frequencies $\omega_{pk}$ are estimated for each model. The standard deviation of the initial estimation error is $\sigma = 0.1$ for both frequency and amplitude. Table 6.7 shows results for Gaussian noise, and Tables 6.8 and 6.9 display the results for Rayleigh and lognormal noise, respectively. These results are consistent with those in Tables 6.4 - 6.6 in that the EHOF makes better detection decisions than the EKF in non-Gaussian noise. However, as can be expected, the probability of detection error increases whenever both frequency and amplitude are being estimated as compared to when only the frequency is estimated.

The estimation results for scenario 3 are given in Figures 6.4, 6.5, and 6.6 for estimation in Gaussian, Rayleigh, and lognormal measurement noise. Again it is shown that both the EKF and the EHOF perform close to the CR bound, with the EHOF giving better results than the EKF after the noise discrimination test is applied.

Table 6.7. MAP Decisions as a Function of SNR
Gaussian Noise - Amplitudes and Frequencies Estimated

SNR(dB)  Filter   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       EKF      49    51    95    5     ?     0.792
 0       EKF      1     25    174   ?     ?     0.971
 5       EKF      ?     0     200   0     0     1.0

[Figure panels: "Performance for c = 1.0" and "(d) Filter Performance for c = 1.0" (subscripts illegible in the source); axes: 10 log(1/MSE) versus SNR(dB)]

Table 6.9. MAP Decisions as a Function of SNR
Lognormal Noise - Amplitudes and Frequencies Estimated

SNR(dB)  Filter   H0    H1    H2    H3    H4    P(θ2|Zk)
-5       EKF_g    44    52    98    6     ?     0.783
-5       EKF_l    28    46    111   11    4     0.933
-5       EHOF     21    33    132   10    4     0.936
 0       EKF_g    3     18    178   1     0     0.968
 0       EKF_l    1     14    179   6     ?     0.995
 0       EHOF     ?     7     184   8     ?     0.995
 5       EKF_g    ?     ?     200   ?     ?     1.0
 5       EKF_l    0     ?     198   2     ?     1.0
 5       EHOF     0     0     200   ?     0     1.0

(? = entry illegible in the source)

6.4  Conclusion 

A general approach to model order selection has been presented based on joint detection/estimation theory. The approach involves the simultaneous application of maximum a posteriori detection theory and nonlinear estimation. The approach requires only an upper limit on the model order and is applicable to data corrupted by additive Gaussian and non-Gaussian noise. The advantage of the approach lies in the potential to accommodate time varying as well as time invariant parameters in the measurement model. Experimental evaluation of the approach demonstrates excellent performance in selecting the correct model order and estimating the system parameters even at SNRs as low as -5 dB.

Chapter 7

JD/E  Applied  to  Estimation  of  Time  Delay  and  Doppler  Shift 

A nonlinear adaptive detector/estimator (NADE) is introduced for single and
multiple  sensor  data  processing.  The  problem  of  target  detection  from  returns  of 
monostatic  sensor(s)  is  formulated  as  a  nonlinear  joint  detection/estimation  problem 
on  the  unknown  parameters  in  the  signal  return.  The  unknown  parameters  involve 
the  presence  of  the  target,  its  range,  azimuth,  and  Doppler  velocity.  The  problems 
of  detecting  the  target  and  estimating  its  parameters  are  considered  jointly.  A 
bank  of  spatially  and  temporally  localized  nonlinear  filters  is  used  to  estimate  the  a 
posteriori  likelihood  of  the  existence  of  the  target  in  a  given  space-time  resolution 
cell.  Within  a  given  cell,  the  localized  filters  are  used  to  produce  refined  spatial 
estimates  of  the  target  parameters.  A  decision  logic  is  used  to  decide  on  the  existence 
of  a  target  within  any  given  resolution  cell  based  on  the  a  posteriori  estimates 
reduced  from  the  likelihood  functions.  The  inherent  spatial  and  temporal  referencing 
in  this  approach  is  used  for  automatic  referencing  required  when  multiple  sensor 
data  is  fused  together.  Thus,  the  approach  is  naturally  extendable  to  centralized 
multisensor  data  fusion. 

This chapter addresses the joint estimation of time delay and Doppler shift from measurements of a received signal. Knapp and Carter [66] showed that the ML estimator of time delay can be represented by a pair of prefilters followed by a matched filter. Stuller [67] generalized these results to obtain ML estimates of time varying delay for nonstationary signals and arbitrary observation intervals. An extension of the ML methods is given by Abatzoglou [68] in which local maximization of the cross-correlation function results in fast ML estimation. An optimum time delay tracker based on a first order Markov model is given in [69]. In [70] the same model is used for optimum time delay detection and tracking. These studies assume linear plant and measurement equations so that an optimal solution can be obtained using the Bayesian approach.

A  general  overview  of  techniques  employed  for  time  delay  estimation  in  sonar 
signal  processing  is  given  by  Carter  [71].  Stein  [72]  describes  how  the  complex 
ambiguity  function  can  be  used  for  joint  estimation  of  time  delay  and  Doppler  shift. 
The  ambiguity  function  approach  is  one  of  the  most  widely  accepted  methods  for 
joint  estimation  of  time  delay  and  Doppler  shift.  The  major  disadvantage  of  this 
approach is that its implementation requires the use of the Fourier transform. Thus
the  resolution  is  limited,  especially  for  short  data  lengths.  Time  delay  estimation 
has  also  been  approached  using  higher  order  statistics.  Nikias  and  Pan  [73]  and 
Chiang  and  Nikias  [74]  make  use  of  the  fact  that  Gaussian  noise  is  suppressed  in 
the  third  order  cumulant  domain  to  form  estimates  of  time  varying  delay. 

This chapter considers the problem of localizing a target in a range-Doppler
space. The range-Doppler space is partitioned into a number of resolution cells.
Each cell is identified with a hypothesis that the signal is present in it. A joint
detection/estimation scheme is then used to localize the target and refine its parameter
estimates (i.e. time delay and Doppler shift). The measurements that are used to
localize the target consist of signal returns corrupted by additive white Gaussian
and non-Gaussian noise.

The problem is formulated using the joint detection/estimation procedure
developed in Chapter 5, adapted to problems with uncertain initial conditions. The
approach involves the operation of several independent nonlinear filters in parallel.
In the case of Gaussian measurement noise the extended Kalman filter (EKF) is used for
estimation. The extended high order filter (EHOF), developed in Chapter 3, is used
in non-Gaussian noise. The parallel filters are distinguished by the initial conditions
used to set up the problem. Along with the state estimate, the a posteriori probability
of each hypothesis is computed recursively.

Two  different  implementations  are  evaluated.  In  the  first  implementation,  the 
model  parameters  for  each  resolution  cell  are  kept  fixed  at  their  a  priori  estimates. 
The  fixed  estimates  are  then  used  to  update  the  a  posteriori  probability  of  each  cell. 
In  the  second  implementation,  the  model  parameters  for  each  resolution  cell  are 
estimated  on-line  and  used  to  update  the  a  posteriori  probability  for  each  resolution 
cell.  After  all  data  is  processed,  the  a  posteriori  probabilities  and  the  initial  estimates 
are  used  to  produce  a  minimum  mean  square  error  (MMSE)  estimate  of  the  time 
delay  and  Doppler  shift. 

7.1  Problem  Statement 

Consider the problem of signal detection and parameter estimation in the
context of the reception of an active echo return from an object that has been
illuminated by a monostatic source. The situation is considered in which there are
$P$ collocated sources that illuminate the target simultaneously, but with different
carrier frequencies designated $\omega_{c_p}$. The received signal at each sensor is
frequency-translated by mixing it with a signal at frequency $\omega_{t_p}$. The resulting
signal is low-pass filtered, and digitized at a rate $f_s$, which is at least twice the
highest frequency in the data. The time between samples is denoted $t_s$. It is assumed
that all sensors have the same digitization rate, and that all clocks are synchronized.
The general expression for the received signal at the $p$th sensor can be written

$$z_{kp} = a_{kp}(\tau_k)\, p_{kp}(\tau_k, \nu_k)\, r_{kp}(\tau_k, \nu_k) + v_{kp} \tag{7.1}$$

where $a_{kp}(\tau_k)$ is the received signal amplitude, $p_{kp}(\tau_k, \nu_k)$ is the pulse shaping
function, and

$$r_{kp}(\tau_k, \nu_k) = \cos\left[\nu_k\, \omega_{c_p} (k t_s - \tau_k) - \omega_{t_p} k t_s\right] \tag{7.2}$$

The noise $v_{kp}$ is white with $E[v_{kp}] = 0$ and $E[v_{kp} v_{jp}] = \sigma_{v_p}^2\, \delta(k - j)$, and $\tau_k$ is the
time delay between signal transmission and reception. $\tau_k$ is a function of the range
$D_k$ between the receiver and the object, and is given by

$$\tau_k = \frac{2 D_k}{c} \tag{7.3}$$

The Doppler shift parameter $\nu_k$ is given [76] by

$$\nu_k = 1 + \frac{2 V_{dk}}{c} \tag{7.4}$$

where $V_{dk}$ is the Doppler velocity obtained by projecting the velocity vectors of
the target and receiver along the line of sight between them, and $c$ is the speed
of electromagnetic propagation. $\nu_k$ is bounded by the perceived maximum speed
of the object and exact knowledge of the receiver platform speed. Based on these
capabilities one could postulate fairly accurate representations for the moments of
the probability density function of $\nu_k$. For unambiguous range estimation the
uncertainty in $\tau_k$, denoted $\Delta\tau_k$, is bounded by $\Delta\tau_k < 2\pi/(\nu_k \omega_{c_p})$. This is due to
the fact that the $\cos(\cdot)$ function is not monotonic (i.e. $r_{kp}(\tau_1, \nu_k) = r_{kp}(\tau_2, \nu_k)$ if
$\tau_2 - \tau_1 = 2\pi/(\nu_k \omega_{c_p})$).

$p_{kp}(\tau_k, \nu_k)$ is the pulse shaping function, which has average energy $E_p$. The
signal amplitude is attenuated due to spherical spreading loss by a factor of $1/D_k^2$.
With known transmitted amplitude $A_{tp}$, the received amplitude as a function of
time delay is given by

$$a_{kp}(\tau_k) = \frac{4 A_{tp}}{(c\, \tau_k)^2} \tag{7.5}$$

Note that the effects of filtering, amplification, and digitization on the received
signal amplitude are not considered. These effects are generally known and
can be accounted for in the amplitude function (7.5).
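The signal model of this section can be sketched numerically as follows. This is a minimal illustration, not the implementation used in the experiments: the function names are hypothetical, the Hanning pulse shape is the one adopted later in Section 7.4, and the units (miles, seconds) follow the values quoted in Section 7.8.

```python
import math

C_MILES_PER_SEC = 186000.0  # propagation speed quoted in Section 7.8


def time_delay(range_miles, c=C_MILES_PER_SEC):
    """Two-way delay tau = 2 D / c, equation (7.3)."""
    return 2.0 * range_miles / c


def doppler_parameter(v_doppler, c=C_MILES_PER_SEC):
    """Doppler parameter nu = 1 + 2 V_d / c, equation (7.4); v_doppler in miles/sec."""
    return 1.0 + 2.0 * v_doppler / c


def received_amplitude(tau, a_tx, c=C_MILES_PER_SEC):
    """Spherical spreading loss, a(tau) = 4 A_t / (c tau)^2, equation (7.5)."""
    return 4.0 * a_tx / (c * tau) ** 2


def hanning_pulse(k, t_s, tau, nu, t_w):
    """Hanning pulse shaping function (assumed here, as in Section 7.4)."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * nu * (k * t_s - tau) / t_w))


def carrier(k, t_s, tau, nu, w_c, w_t=0.0):
    """r = cos(nu w_c (k t_s - tau) - w_t k t_s), equation (7.2)."""
    return math.cos(nu * w_c * (k * t_s - tau) - w_t * k * t_s)


def noiseless_sample(k, t_s, tau, nu, w_c, t_w, a_tx, w_t=0.0):
    """Noise-free part of (7.1); zero outside the pulse interval."""
    t = k * t_s
    if t < tau or t > tau + t_w:
        return 0.0
    return (received_amplitude(tau, a_tx)
            * hanning_pulse(k, t_s, tau, nu, t_w)
            * carrier(k, t_s, tau, nu, w_c, w_t))


# A Doppler velocity of 300 mph expressed in miles/sec.
nu_nom = doppler_parameter(300.0 / 3600.0)
```

With these definitions, `nu_nom - 1` reproduces the nominal Doppler offset of about 8.96e-7 used in the experimental evaluation.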

7.2  Joint  Detection/Estimation 


The joint detection/estimation procedure for problems with uncertain initial
conditions is followed in this chapter for optimal estimation of time delay and
Doppler shift. This procedure is described in Section 5.5. The hypotheses are
distinguished from each other by the initial state estimates
$\hat{x}_{0|0,\theta_i}$ and initial state covariances $P_{0|0,\theta_i}$. The measurement and process
models are the same for each hypothesis. Let $\theta_i \in \Theta$ designate the parameter vector
that describes the different initial conditions on the states. The parameter vector
$\theta_i$ is also assumed to be time invariant. Under hypothesis $H_{\theta_i}$ the discrete time
measurements are modeled according to

$$H_{\theta_i}: \quad z_k = g_k(x_k) + v_k, \qquad \text{with i.c.'s } \hat{x}_{0|0,\theta_i},\; P_{0|0,\theta_i} \tag{7.6}$$


The measurement vector $z_k$ is composed of the scalar measurements of the
$P$ individual sensors such that

$$z_k = [\, z_{k1} \;\; z_{k2} \;\; \cdots \;\; z_{kP} \,]^T \tag{7.7}$$

The state $x_k$ is common for all $\theta_i \in \Theta$, and satisfies the discrete time process
equation

$$x_k = f(x_{k-1}) + w_{k-1} \tag{7.8}$$

The initial state estimate, the measurement noise, and the process noise are
uncorrelated. The process and measurement noise are zero mean and distributed with
covariances $E[w_k w_k^T] = Q_k$ and $E[v_k v_k^T] = R_k$.

For each $\theta_i \in \Theta$ (each assumed model), a minimum variance estimate of the
model parameters is obtained recursively using the joint detection/estimation
technique. These estimates are subsequently used to estimate the likelihood of each
model being the correct one. Based on these likelihood estimates, a maximum a
posteriori (MAP) decision criterion or a minimum mean square error (MMSE) decision
criterion can be used to select the proper model.

Using Bayes' rule, the a posteriori probability of the parameter vector $\theta_i$ is
updated recursively by [67, 68]

$$p(\theta_i | Z_k) = \frac{p(\theta_i | Z_{k-1})\, p(z_k | Z_{k-1}, \theta_i)}{\sum_{j=1}^{L} p(\theta_j | Z_{k-1})\, p(z_k | Z_{k-1}, \theta_j)} \tag{7.9}$$

where $Z_{k-1} = \{z_1, z_2, \ldots, z_{k-1}\}$ and the sum runs over all $L$ hypotheses. The
initial condition for (7.9) is the a priori probability density function
$p(\theta) = p(\theta | Z_0)$, which is assumed to be known. The
densities $p(z_k | Z_{k-1}, \theta_i)$ are updated using the EKF or the EHOF.

The update procedure for measurements in Gaussian, Rayleigh, and lognormal
noise is described in Chapter 5, Sections 5.2 and 5.3.


Since the state vector $x_k$ is common to all models, the minimum mean
squared error (MMSE) estimate can be used. The MMSE estimate is expressed
as a weighted average of the conditional state estimates $\hat{x}_{k|k,\theta_i}$ over all $\theta_i$ as follows:

$$\hat{x}_k = \sum_{i=1}^{L} p(\theta_i | Z_k)\, \hat{x}_{k|k,\theta_i} \tag{7.10}$$
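The recursion (7.9) and the weighted average (7.10) can be sketched as follows. This is a minimal illustration under stated assumptions: the per-hypothesis likelihoods are taken as given (in the text they come from the EKF or the EHOF), the state is scalar, and the function names and numerical values are made up for the example.

```python
def update_posteriors(priors, likelihoods):
    """One step of (7.9): posterior_i is proportional to prior_i * p(z_k | Z_{k-1}, theta_i)."""
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("all hypotheses have zero likelihood")
    return [w / total for w in weighted]


def mmse_estimate(posteriors, estimates):
    """Weighted average (7.10) of the per-hypothesis (scalar) state estimates."""
    return sum(p * x for p, x in zip(posteriors, estimates))


def map_index(posteriors):
    """Index of the hypothesis with the largest a posteriori probability."""
    return max(range(len(posteriors)), key=lambda i: posteriors[i])


# Illustrative numbers only: four hypotheses with equal priors.
priors = [0.25, 0.25, 0.25, 0.25]
likelihoods = [0.1, 0.4, 0.4, 0.1]
posteriors = update_posteriors(priors, likelihoods)
x_mmse = mmse_estimate(posteriors, [1.0, 2.0, 3.0, 4.0])
```

In a full run, `update_posteriors` would be called once per measurement, with the previous posteriors passed back in as the priors.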

7.3  Specification  of  Initial  Conditions 

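The localized initial conditions for each resolution cell are defined as follows.
Let the time delay have mean $\hat{\tau}_0$ and density function $p_{\tau_0}(\tau_0)$. The distribution of $\tau_0$
is segmented into $N$ nonoverlapping segments such that the segment around some
localized initial estimate $\hat{\tau}_{n0}$ is defined by

$$p_{\tau_{n0}}(\tau_{n0}) = p_{\tau_0}(\tau_0), \qquad \alpha_n < \tau_0 < \alpha_{n+1}, \quad 1 \le n \le N \tag{7.11}$$

We have

$$\sum_{n=1}^{N} \int_{\alpha_n}^{\alpha_{n+1}} p_{\tau_{n0}}(\tau)\, d\tau = \int_{-\infty}^{\infty} p_{\tau_0}(\tau)\, d\tau = 1$$

Define the scaling parameters $\zeta_n$ such that

$$\zeta_n \int_{\alpha_n}^{\alpha_{n+1}} p_{\tau_{n0}}(\tau)\, d\tau = 1, \qquad 1 \le n \le N$$

Then the mean and variance of the initial conditions of the segmented model are
given by

$$\hat{\tau}_{n0} = E[\tau_{n0}] = \zeta_n \int_{\alpha_n}^{\alpha_{n+1}} \tau\, p_{\tau_{n0}}(\tau)\, d\tau$$

$$\mathrm{Var}[\tau_{n0}] = \zeta_n \int_{\alpha_n}^{\alpha_{n+1}} \tau^2\, p_{\tau_{n0}}(\tau)\, d\tau - \hat{\tau}_{n0}^2$$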

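Similarly, let the initial estimate of Doppler shift have mean $\hat{\nu}_0$ and density function
$p_{\nu_0}(\nu_0)$. Now let the distribution for $\nu_0$ be segmented into $M$ nonoverlapping
segments such that the segment corresponding to initial estimate $\hat{\nu}_{m0}$ is defined by

$$p_{\nu_{m0}}(\nu_{m0}) = p_{\nu_0}(\nu_0), \qquad \gamma_m < \nu_0 < \gamma_{m+1}, \quad 1 \le m \le M \tag{7.12}$$

We have

$$\sum_{m=1}^{M} \int_{\gamma_m}^{\gamma_{m+1}} p_{\nu_{m0}}(\nu)\, d\nu = \int_{-\infty}^{\infty} p_{\nu_0}(\nu)\, d\nu = 1$$

Define the scaling parameters $\kappa_m$ such that

$$\kappa_m \int_{\gamma_m}^{\gamma_{m+1}} p_{\nu_{m0}}(\nu)\, d\nu = 1, \qquad 1 \le m \le M$$

Then the mean and variance of the initial conditions of the segmented model for $\nu_0$
are given by

$$\hat{\nu}_{m0} = E[\nu_{m0}] = \kappa_m \int_{\gamma_m}^{\gamma_{m+1}} \nu\, p_{\nu_{m0}}(\nu)\, d\nu$$

$$\mathrm{Var}[\nu_{m0}] = \kappa_m \int_{\gamma_m}^{\gamma_{m+1}} \nu^2\, p_{\nu_{m0}}(\nu)\, d\nu - \hat{\nu}_{m0}^2$$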

With $N$ different initial conditions on $\tau_0$ and $M$ different initial conditions
on $\nu_0$, there are $NM$ different resolution cells for referencing the measurements.
A different filter is initialized in each resolution cell. The total number of cells,
$NM$, in the resolution space can be large, depending on the desired accuracy in the
parameter resolution. However, the filters are independent of each other and can be
run in parallel, so that with one processor per filter the execution time reduces to
that of a single filter.

The parameter vector $\theta_i$, $i = (n - 1) M + m$, $1 \le n \le N$, $1 \le m \le M$, is
defined to be the $(n, m)$th resolution cell and is used to define $NM$ initial conditions
on the state variables $\tau$ and $\nu$. The a priori probabilities of each hypothesis are
determined by integrating the density functions $p_{\tau_0}(\tau_0)$ and $p_{\nu_0}(\nu_0)$ over the limits
defined for each hypothesis. They are given by

$$P(\theta_i) = \int_{\alpha_n}^{\alpha_{n+1}} p_{\tau_0}(\tau)\, d\tau \int_{\gamma_m}^{\gamma_{m+1}} p_{\nu_0}(\nu)\, d\nu \tag{7.13}$$
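For the special case of uniform prior densities (the case used later in the experimental evaluation), the segmented means, variances, and a priori probabilities have closed forms. The sketch below assumes equal-width cells and uniform densities; the function names, the dictionary layout, and the example values are illustrative and not taken from the text.

```python
def segment_uniform(center, half_width, n_cells):
    """Split a uniform density on [center - half_width, center + half_width]
    into n_cells equal cells; return (mean, variance, prior) for each cell."""
    lower = center - half_width
    cell = 2.0 * half_width / n_cells
    cells = []
    for n in range(n_cells):
        a_n = lower + n * cell            # lower limit of the cell
        mean = a_n + 0.5 * cell           # mean of a uniform density on the cell
        variance = cell ** 2 / 12.0       # variance of a uniform density on the cell
        prior = 1.0 / n_cells             # integral of the prior over the cell
        cells.append((mean, variance, prior))
    return cells


def resolution_cells(tau_cells, nu_cells):
    """Combine N delay cells and M Doppler cells into N*M hypotheses,
    indexed i = (n - 1) * M + m with 1-based n and m, as in the text."""
    m_total = len(nu_cells)
    hypotheses = {}
    for n, (tau_mean, tau_var, tau_prior) in enumerate(tau_cells, start=1):
        for m, (nu_mean, nu_var, nu_prior) in enumerate(nu_cells, start=1):
            i = (n - 1) * m_total + m
            hypotheses[i] = {
                "x0": (tau_mean, nu_mean),
                "P0": ((tau_var, 0.0), (0.0, nu_var)),
                "prior": tau_prior * nu_prior,
            }
    return hypotheses


t_s = 1e-8
tau_nom = 0.000324
tau_cells = segment_uniform(tau_nom, 3.5 * t_s, 7)
nu_cells = segment_uniform(1.0 + 8.96e-7, 7.47e-7, 2)
cells = resolution_cells(tau_cells, nu_cells)
```

Each entry of `cells` supplies the initial state, initial covariance, and a priori probability for one of the parallel filters.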

7.4  Joint  Detection/Estimation  of  Delay  and  Doppler 

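The model is now considered in which both $\nu_k$ and $\tau_k$ are unknown, with both
state variables to be estimated. The parameter vector $\theta_i = [n \;\; m]^T$, $1 \le i \le NM$, is
used to define $NM$ different initial conditions on the state variables $\tau$ and $\nu$. The
hypothesis $H_i$ that corresponds to $\theta_i$ for sensor $p$ is given by

$$H_i: \quad z_{kp} = \begin{cases} v_k, & k t_s < \tau_k \\ g_{kp}(\tau_k, \nu_k) + v_k, & \tau_k \le k t_s \le \tau_k + t_w \\ v_k, & k t_s > \tau_k + t_w \end{cases} \tag{7.14}$$

where $t_w$ is the pulse width. The initial conditions are given by

$$\hat{x}_{0|0,\theta_i} = [\hat{\tau}_{n0},\; \hat{\nu}_{m0}]^T, \qquad P_{0|0,\theta_i} = \begin{bmatrix} \mathrm{Var}[\tau_{n0}] & 0 \\ 0 & \mathrm{Var}[\nu_{m0}] \end{bmatrix} \tag{7.15}$$

where

$$g_{kp}(\tau_k, \nu_k) = a_{kp}(\tau_k)\, p_{kp}(\tau_k, \nu_k)\, r_{kp}(\tau_k, \nu_k) \tag{7.16}$$

$$\begin{aligned} a_{kp}(\tau_k) &= \frac{4 A_{tp}}{(c\, \tau_k)^2} \\ p_{kp}(\tau_k, \nu_k) &= 0.5 \left(1 - \cos(2\pi \nu_k (k t_s - \tau_k)/t_w)\right) \\ r_{kp}(\tau_k, \nu_k) &= \cos\left[\nu_k\, \omega_{c_p} (k t_s - \tau_k) - \omega_{t_p} k t_s\right] \end{aligned} \tag{7.17}$$

The Hanning window is used as the pulse shaping function $p_{kp}(\cdot)$.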

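With the state variable vector as $x_k = [\tau_k \;\; \nu_k]^T$, the Jacobian of the measurement
model is given by

$$G_k = [\, G_{k1}^T \;\; G_{k2}^T \;\; \cdots \;\; G_{kP}^T \,]^T \tag{7.18}$$

where the Jacobian $G_{kp}$ for the $p$th sensor, which is used in both the EKF and the
EHOF, is given by

$$G_{kp} = \left. \frac{\partial g_{kp}(x_k, \theta_i)}{\partial x_k} \right|_{x_k = \hat{x}_{k|k-1,\theta_i}} = \begin{bmatrix} \dfrac{\partial a_{kp}}{\partial x_k(1)}\, p_{kp} r_{kp} + a_{kp} \dfrac{\partial p_{kp}}{\partial x_k(1)}\, r_{kp} + a_{kp} p_{kp} \dfrac{\partial r_{kp}}{\partial x_k(1)} \\[2ex] a_{kp} \dfrac{\partial p_{kp}}{\partial x_k(2)}\, r_{kp} + a_{kp} p_{kp} \dfrac{\partial r_{kp}}{\partial x_k(2)} \end{bmatrix}^T \tag{7.19}$$

The partial derivatives within the brackets are given by

$$\begin{aligned} \frac{\partial a_{kp}}{\partial x_k(1)} &= \frac{-8 A_{tp}}{c^2\, x_k(1)^3} \\ \frac{\partial p_{kp}}{\partial x_k(1)} &= -\frac{\pi x_k(2)}{t_w} \sin\left(2\pi x_k(2) (k t_s - x_k(1))/t_w\right) \\ \frac{\partial r_{kp}}{\partial x_k(1)} &= x_k(2)\, \omega_{c_p} \sin\left[x_k(2)\, \omega_{c_p} (k t_s - x_k(1)) - \omega_{t_p} k t_s\right] \\ \frac{\partial p_{kp}}{\partial x_k(2)} &= \frac{\pi (k t_s - x_k(1))}{t_w} \sin\left(2\pi x_k(2) (k t_s - x_k(1))/t_w\right) \\ \frac{\partial r_{kp}}{\partial x_k(2)} &= -\omega_{c_p} (k t_s - x_k(1)) \sin\left[x_k(2)\, \omega_{c_p} (k t_s - x_k(1)) - \omega_{t_p} k t_s\right] \end{aligned} \tag{7.20}$$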
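One way to guard against sign and factor errors in a hand-derived Jacobian is to compare it with a central finite difference. The sketch below does this for a single sensor, using the product form $g = a\,p\,r$ with the Hanning pulse and the partial derivatives obtained from it by the product rule. The parameter values are toy numbers chosen only to keep the arithmetic well conditioned; they are not the values used in the experiments, and the function names are hypothetical.

```python
import math


def g_and_jacobian(x1, x2, k, t_s, t_w, w_c, w_t, a_tx, c):
    """Return g = a*p*r and its gradient [dg/dx1, dg/dx2] for one sensor.

    x1 is the time delay and x2 the Doppler parameter (x_k = [tau_k, nu_k]^T).
    """
    t = k * t_s
    u = 2.0 * math.pi * x2 * (t - x1) / t_w   # Hanning window argument
    phi = x2 * w_c * (t - x1) - w_t * t       # carrier phase

    a = 4.0 * a_tx / (c * x1) ** 2
    p = 0.5 * (1.0 - math.cos(u))
    r = math.cos(phi)

    da_dx1 = -8.0 * a_tx / (c ** 2 * x1 ** 3)
    dp_dx1 = -(math.pi * x2 / t_w) * math.sin(u)
    dr_dx1 = x2 * w_c * math.sin(phi)
    dp_dx2 = (math.pi * (t - x1) / t_w) * math.sin(u)
    dr_dx2 = -w_c * (t - x1) * math.sin(phi)

    g = a * p * r
    grad = [da_dx1 * p * r + a * dp_dx1 * r + a * p * dr_dx1,
            a * dp_dx2 * r + a * p * dr_dx2]
    return g, grad


def finite_difference(x1, x2, h=1e-6, **kw):
    """Central-difference approximation of the same gradient."""
    def g(a, b):
        return g_and_jacobian(a, b, **kw)[0]
    return [(g(x1 + h, x2) - g(x1 - h, x2)) / (2.0 * h),
            (g(x1, x2 + h) - g(x1, x2 - h)) / (2.0 * h)]


# Toy parameters (assumptions for illustration only).
params = dict(k=23, t_s=0.1, t_w=1.2, w_c=2.0 * math.pi, w_t=0.0, a_tx=1.0, c=1.0)
_, analytic = g_and_jacobian(2.0, 1.0, **params)
numeric = finite_difference(2.0, 1.0, **params)
```

If the two gradients disagree by more than the finite-difference error, one of the partial derivatives has been transcribed incorrectly.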


The detection procedure consists of computing $g_k(x_k, \theta_i)$ and $G_k(x_k, \theta_i)$ for
each model $\theta_i$ at every sample time $k t_s$ for which the signal is assumed present,
that is, whenever $x_k(1) \le k t_s \le x_k(1) + t_w$. The equations for the
innovations $\tilde{z}_{k|k-1,\theta_i}$ and their covariance $S_{k|k-1,\theta_i}$ are given in (5.41, 5.42) for both the
EKF and the EHOF. Whenever the signal is assumed absent ($k t_s < x_k(1)$, or
$k t_s > x_k(1) + t_w$), the innovations and their covariance become

$$\tilde{z}_{k|k-1,\theta_i} = z_k, \qquad S_{k|k-1,\theta_i} = R_k$$

With $\tilde{z}_{k|k-1,\theta_i}$ and $S_{k|k-1,\theta_i}$, $P(\theta_i | Z_k)$ is computed using (7.9). The final state
estimate is then computed using (7.10).
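The gating logic above can be sketched for a scalar measurement as follows. Since equations (5.41, 5.42) are not reproduced in this chapter, the signal-present branch below assumes the standard EKF innovation covariance $S = G P G^T + R$ and a Gaussian likelihood; the EHOF and the non-Gaussian densities of Chapter 5 would differ. All names are illustrative.

```python
import math


def signal_present(k, t_s, tau_hat, t_w):
    """True when sample time k*t_s lies inside the assumed pulse interval."""
    t = k * t_s
    return tau_hat <= t <= tau_hat + t_w


def innovation_and_covariance(z, g_pred, G, P_pred, R, present):
    """Scalar-measurement innovation and its variance.

    G is a list of length n and P_pred an n x n nested list.
    When the signal is assumed absent the innovation is z and S = R.
    """
    if not present:
        return z, R
    n = len(G)
    # S = G P G^T + R (standard EKF form, assumed here)
    GP = [sum(G[i] * P_pred[i][j] for i in range(n)) for j in range(n)]
    S = sum(GP[j] * G[j] for j in range(n)) + R
    return z - g_pred, S


def gaussian_likelihood(innovation, S):
    """Gaussian density of a scalar innovation with variance S."""
    return math.exp(-0.5 * innovation ** 2 / S) / math.sqrt(2.0 * math.pi * S)


innov, S = innovation_and_covariance(
    0.3, 1.0, [1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], 0.5, True)
likelihood = gaussian_likelihood(innov, S)
```

The resulting likelihood for each hypothesis is what enters the numerator of the recursive a posteriori update.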


7.5  Joint  Detection/Estimation  of  Time  Delay 

Under some conditions in which $\nu_k$ and $\tau_k$ are unknown it may be possible to
obtain improved estimates of only one of these state variables. For example,
if the pulse width is very short, the minimum variance of $\nu_k$ that can
be obtained from any unbiased estimator may be larger than the initial Doppler
variance. In this case Doppler estimation is redundant. This section addresses the
model in which $\nu_k$ and $\tau_k$ are unknown, but only the time delay is to be estimated.
Estimation of Doppler alone is addressed in the next section. The parameter vector
$\theta_i$ is defined as before. To define hypothesis $H_i$, $\nu_k$ is replaced by $\hat{\nu}_{m0}$, that is, the
Doppler parameter does not change from its initial estimate. Hypothesis $H_i$ is now
given by

$$H_i: \quad z_{kp} = \begin{cases} v_k, & k t_s < \tau_k \\ g_{kp}(\tau_k, \hat{\nu}_{m0}) + v_k, & \tau_k \le k t_s \le \tau_k + t_w \\ v_k, & k t_s > \tau_k + t_w \end{cases} \tag{7.21}$$

with initial conditions

$$\hat{x}_{0|0,\theta_i} = [\hat{\tau}_{n0},\; \hat{\nu}_{m0}]^T, \qquad P_{0|0,\theta_i} = [\mathrm{Var}[\tau_{n0}]] \tag{7.22}$$

where

$$g_{kp}(\tau_k, \hat{\nu}_{m0}) = a_{kp}(\tau_k)\, p_{kp}(\tau_k, \hat{\nu}_{m0})\, r_{kp}(\tau_k, \hat{\nu}_{m0}) \tag{7.23}$$

$$\begin{aligned} a_{kp}(\tau_k) &= \frac{4 A_{tp}}{(c\, \tau_k)^2} \\ p_{kp}(\tau_k, \hat{\nu}_{m0}) &= 0.5 \left(1 - \cos(2\pi \hat{\nu}_{m0} (k t_s - \tau_k)/t_w)\right) \\ r_{kp}(\tau_k, \hat{\nu}_{m0}) &= \cos\left[\hat{\nu}_{m0}\, \omega_{c_p} (k t_s - \tau_k) - \omega_{t_p} k t_s\right] \end{aligned} \tag{7.24}$$

The state variable is $x_k = [\tau_k]$, and the Jacobian equations (7.19, 7.20)
now contain only those terms that include partial derivatives with respect to $x_k(1)$.
The detection/estimation procedure is then the same as that described previously.




7.6  Joint  Detection/Estimation  of  Doppler  Shift 

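Consider the model in which $\nu_k$ and $\tau_k$ are unknown, but only the Doppler
shift parameter is to be estimated. The parameter vector $\theta_i$ is defined as before. To
define hypothesis $H_i$, $\tau_k$ is replaced by $\hat{\tau}_{n0}$, that is, the time delay does not change
from its initial estimate. Hypothesis $H_i$ is now expressed as

$$H_i: \quad z_{kp} = \begin{cases} v_k, & k t_s < \hat{\tau}_{n0} \\ g_{kp}(\hat{\tau}_{n0}, \nu_k) + v_k, & \hat{\tau}_{n0} \le k t_s \le \hat{\tau}_{n0} + t_w \\ v_k, & k t_s > \hat{\tau}_{n0} + t_w \end{cases} \tag{7.25}$$

with initial conditions

$$\hat{x}_{0|0,\theta_i} = [\hat{\tau}_{n0},\; \hat{\nu}_{m0}]^T, \qquad P_{0|0,\theta_i} = [\mathrm{Var}[\nu_{m0}]] \tag{7.26}$$

where

$$g_{kp}(\hat{\tau}_{n0}, \nu_k) = a_{kp}(\hat{\tau}_{n0})\, p_{kp}(\hat{\tau}_{n0}, \nu_k)\, r_{kp}(\hat{\tau}_{n0}, \nu_k) \tag{7.27}$$

$$\begin{aligned} a_{kp}(\hat{\tau}_{n0}) &= \frac{4 A_{tp}}{(c\, \hat{\tau}_{n0})^2} \\ p_{kp}(\hat{\tau}_{n0}, \nu_k) &= 0.5 \left(1 - \cos(2\pi \nu_k (k t_s - \hat{\tau}_{n0})/t_w)\right) \\ r_{kp}(\hat{\tau}_{n0}, \nu_k) &= \cos\left[\nu_k\, \omega_{c_p} (k t_s - \hat{\tau}_{n0}) - \omega_{t_p} k t_s\right] \end{aligned} \tag{7.28}$$

The state variable is $x_k = [\nu_k]$, and the Jacobian equations (7.19, 7.20)
now contain only those terms that include partial derivatives with respect to $x_k(2)$.
The detection/estimation procedure is described in Section 7.4.

7.7  Detection  Without  Estimation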

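For the detection-only problem we use the initial estimates $\hat{\tau}_{n0}$ and $\hat{\nu}_{m0}$ in
the measurement equation. The state estimates remain invariant. The hypothesis
$H_i$ that corresponds to parameter $\theta_i$ is defined as

$$H_i: \quad z_k = \begin{cases} v_k, & k t_s < \hat{\tau}_{n0} \\ g_k(\hat{\tau}_{n0}, \hat{\nu}_{m0}) + v_k, & \hat{\tau}_{n0} \le k t_s \le \hat{\tau}_{n0} + t_w \\ v_k, & k t_s > \hat{\tau}_{n0} + t_w \end{cases} \tag{7.29}$$

where

$$g_k(\hat{\tau}_{n0}, \hat{\nu}_{m0}) = a_k(\hat{\tau}_{n0})\, p_k(\hat{\tau}_{n0}, \hat{\nu}_{m0})\, r_k(\hat{\tau}_{n0}, \hat{\nu}_{m0}) \tag{7.30}$$

The procedure for determining the a posteriori probability is the same as
that described in the previous section with the exception that the state variables
are held constant at their initial estimates $\hat{\tau}_{n0}$ and $\hat{\nu}_{m0}$.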

7.8  Experimental  Evaluation 

As a prelude to the experimental evaluation it is useful to discuss the minimum
variance that can be obtained through the estimation of time delay and
Doppler shift. Consider the measurement model for a single-frequency pulse in
a rectangular window of size $K$, where $K$ is the number of samples per pulse. This
signal is expressed by

$$h_k = \sin(\nu \omega_c (k t_s - \tau)), \qquad 0 \le k < K$$

If the signal $h_k$ is received in additive white Gaussian noise with variance $\sigma^2$, the
Cramer-Rao bound (Appendix A) is given by

$$\mathrm{Var}[\tau] \ge \frac{1}{\mathrm{SNR}}\, \frac{1}{\nu^2 \omega_c^2 K} \tag{7.31}$$


Var[i/]  > 


:£r„'*{(£)2} 


SNR  u.?(J(ESr*2) 


(7.32) 


where SNR is the mean square signal amplitude divided by the white noise variance,
i.e. $\mathrm{SNR} = A^2/(2\sigma^2)$. The delay variance is reduced by increasing the carrier
frequency $\omega_c$ or by increasing the number of samples. Doppler variance is reduced by
increasing the carrier frequency or by increasing the pulse width (i.e. by integrating
over a longer time). The bound for time delay is achievable only if the initial
uncertainty $\Delta\tau_0 < 1/(2 f_c)$. Measurement of time delay is actually accomplished by
measuring the phase of the received signal. Since the phase is periodic at a rate $f_c$,
two time delay estimates separated by $1/f_c$ will give the same phase measurement
for a single-frequency rectangular pulse. That is, an initial estimate $\hat{\tau} > \tau + 1/(2 f_c)$
is more likely to converge to $\tau + 1/f_c$ than it is to $\tau$. The situation can be improved
somewhat by employing amplitude modulation or angle modulation. However, as
shown in the following section, window functions such as the Hanning window do
not help appreciably. Thus the variance of time delay error may be more accurately
bounded by

$$\mathrm{Var}[\tau] \ge \begin{cases} \dfrac{1}{\mathrm{SNR}\, \nu^2 \omega_c^2 K}, & \Delta\tau_0 < 1/(2 f_c) \\[1.5ex] \mathrm{Var}[\tau_0], & \Delta\tau_0 > 1/(2 f_c) \end{cases} \tag{7.33}$$
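The delay-only bound can be checked numerically from the signal model. The sketch below assumes a pulse $h_k = A \sin(\nu \omega_c (k t_s - \tau))$ with known $\nu$ (the amplitude $A$ is an assumption, consistent with $\mathrm{SNR} = A^2/(2\sigma^2)$) and computes the Fisher information for $\tau$ as $(1/\sigma^2) \sum_k (\partial h_k / \partial \tau)^2$. When the pulse spans whole carrier cycles this equals $\mathrm{SNR}\, \nu^2 \omega_c^2 K$. The parameter values are illustrative.

```python
import math


def fisher_information_delay(A, nu, w_c, t_s, tau, K, sigma2):
    """(1/sigma^2) * sum_k (d h_k / d tau)^2 for h_k = A sin(nu w_c (k t_s - tau))."""
    total = 0.0
    for k in range(K):
        dh_dtau = -A * nu * w_c * math.cos(nu * w_c * (k * t_s - tau))
        total += dh_dtau ** 2
    return total / sigma2


def crb_delay_closed_form(A, nu, w_c, K, sigma2):
    """1 / (SNR nu^2 w_c^2 K) with SNR = A^2 / (2 sigma^2)."""
    snr = A ** 2 / (2.0 * sigma2)
    return 1.0 / (snr * nu ** 2 * w_c ** 2 * K)


# Illustrative values: 10 MHz carrier sampled at 100 MHz, two full cycles.
w_c = 2.0 * math.pi * 10e6
t_s = 1e-8
crb_numeric = 1.0 / fisher_information_delay(1.0, 1.0, w_c, t_s, 0.0, 20, 0.1)
crb_formula = crb_delay_closed_form(1.0, 1.0, w_c, 20, 0.1)
crb_double = crb_delay_closed_form(1.0, 1.0, w_c, 40, 0.1)
```

Doubling the number of samples halves the bound, which is a 3 dB reduction in delay variance.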


If the initial estimation error for time delay is not known to within $\Delta\tau = 1/(2 f_c)$,
then parameter estimators will not do any better than the initial estimates.
This is especially true whenever the SNR is large enough to nullify the effects of any
pulse shaping. This limitation drives the requirement for the number of different
filters needed for accurate delay estimation. If $t_0 = L \Delta\tau_0$ is the total width of the
uniform distribution of the initial delay error, then the number of required filters is

$$L = 2 f_c t_0 \tag{7.34}$$

For example, assume a typical radar operating frequency of 10 GHz. For unambiguous
range resolution the associated initial time delay uncertainty must be less than
$1 \times 10^{-10}$ seconds, corresponding to a range uncertainty, computed from equation
(7.3), of 0.05 feet. It may be more appropriate to discuss time delay estimation
in the context of communications signals where the operating frequencies are much
lower and the pulse widths larger.
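The arithmetic behind the 10 GHz example and the filter count (7.34) can be reproduced as follows. The function names are illustrative, and the 10 MHz, 1 microsecond case is a made-up example rather than a configuration from the text.

```python
import math


def required_filters(f_c, t0):
    """Number of filters L = 2 f_c t0 from (7.34)."""
    return 2.0 * f_c * t0


def range_uncertainty_feet(delta_tau, c_miles_per_sec=186000.0):
    """Range uncertainty D = c * delta_tau / 2 from (7.3), converted from miles to feet."""
    return 0.5 * c_miles_per_sec * delta_tau * 5280.0


# Radar example from the text: 10 GHz carrier, delay uncertainty of 1e-10 s.
dr_feet = range_uncertainty_feet(1e-10)

# Hypothetical communications-band example: 10 MHz carrier, 1 microsecond
# total width of the initial delay error distribution.
n_filters = math.ceil(required_filters(10e6, 1e-6) - 1e-9)
```

Here `dr_feet` evaluates to roughly 0.05 feet, matching the figure quoted above, and the hypothetical 10 MHz case needs 20 filters.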

One method for dealing with this problem is to ensure that an initial estimate
is within $\pm 1/(2 f_c)$ of the actual time delay $\tau$. This can be accomplished
by segmenting the initial conditions and operating several estimators in parallel as
described in Sections 7.2 - 7.6. This procedure is evaluated experimentally in the
next section.

Another technique for time delay estimation discussed in the radar literature
[78, pp. 167-169] involves leading- and trailing-edge detection of the envelope of the
received signal. The rise time $t_R$ of the pulse is lower-bounded by the bandwidth
$f_B$ of the received signal, with $t_R \approx 1/f_B$. The receiver includes a bandpass filter
of width $f_B$, an envelope detector, and a threshold stage. For this type of receiver,
the variance of the time delay estimate error is lower-bounded by [79, p. 299], [80,
pp. 400-404]


(7.35) 


where $N_0/2$ is the noise power spectral intensity, and $E_r$ is the received signal energy.
The variance of the frequency estimation error, obtained from coherently processing
the received signal, is bounded by


Var[uc]  > 


2  Er]’1  1 

.Wo]  tv 


(7.36) 


Noting that $\mathrm{Var}[\nu] = \mathrm{Var}[\omega_c]/\omega_c^2$, (7.36) is the continuous-time equivalent of (7.32).
The variance in estimating Doppler shift is reduced by increasing the signal pulse
width or by employing a larger carrier frequency. For envelope detection the variance
in estimating time delay is reduced by increasing the signal bandwidth $f_B$. The ideal
situation is to design the signal to obtain good estimation of both delay and Doppler.
This is generally accomplished by employing amplitude and/or angle modulation
on the pulse. The modulation is designed to produce large bandwidth (low $\mathrm{Var}[\tau]$),
while a large $t_w$ produces low $\mathrm{Var}[\omega_c]$.


Delay and Doppler estimation are addressed separately in the following
experimental evaluation of the JD/E technique. Delay estimation is evaluated in
Section 7.8.1 for a signal with relatively small pulse width (large bandwidth). Doppler
estimation is performed in Section 7.8.2 for a signal with large pulse width.


7.8.1  Time  Delay  Estimation 


Both single and double sensor models ($P = 1$ and $P = 2$) in (7.7) were
selected for experimental evaluation. For this evaluation the sampling frequency was
$f_s = 100 \times 10^6$ Hz, the pulse width was set to $12 t_s$, and $c$, the speed of propagation,
was 186000 miles/sec. For all tests, the nominal time delay and Doppler were
$\tau_{nom} = 0.000324$ and $(\nu_{nom} - 1) = 8.96 \times 10^{-7}$ respectively, corresponding to a
target at a nominal range of 10 miles, traveling at 300 mph Doppler velocity.

It was assumed that the error in the time delay estimate was uniformly distributed
within $\pm 3.5 t_s$ about the nominal delay. The corresponding variance is then
$(7 t_s)^2/12$. The error in the Doppler estimate was assumed to be uniformly distributed
within $\pm 7.47 \times 10^{-7}$ about the nominal Doppler. This corresponds to an error
in the Doppler velocity of ±250 mph. The corresponding variance is $1.85 \times 10^{-13}$.
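The Doppler offsets and variances quoted above follow from $\nu - 1 = 2V/c$ and from the variance of a uniform density, $(\text{full width})^2/12$. A short check, with illustrative function names:

```python
C_MILES_PER_SEC = 186000.0


def doppler_offset(v_mph, c=C_MILES_PER_SEC):
    """nu - 1 = 2 V / c, with the velocity converted from mph to miles/sec."""
    return 2.0 * (v_mph / 3600.0) / c


def uniform_variance(half_width):
    """Variance of a uniform density of total width 2 * half_width."""
    return (2.0 * half_width) ** 2 / 12.0


t_s = 1.0 / 100e6                          # sample period for f_s = 100 MHz
tau_variance = uniform_variance(3.5 * t_s)  # equals (7 t_s)^2 / 12
nu_nominal_offset = doppler_offset(300.0)   # about 8.96e-7
nu_half_width = doppler_offset(250.0)       # about 7.47e-7
nu_variance = uniform_variance(nu_half_width)
```

The computed Doppler variance is about 1.86e-13, in agreement with the quoted value to within rounding.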

7.8.1.1  Single  Sensor  Evaluation

It is noted that the model in (7.14) does not change appreciably for the
range of values for $\nu_k$ specified in Section 7.8.1. That is, the magnitudes of the
partial derivatives with respect to $x_k(2)$ (Doppler shift) in (7.19) are much lower
than those with respect to $x_k(1)$ (time delay). In fact it was found experimentally
that the filter gain corresponding to the Doppler shift parameter was very small,
resulting in negligible change in this parameter from its initial estimate. For this
reason the results presented for joint detection/estimation (JD/E) are shown for
time delay estimation only. In this case the measurement model (7.14) becomes
$z_{kp} = g_{kp}(\tau_k, \nu_{nom}) + v_k$, for $\tau_k \le k t_s \le \tau_k + t_w$, with $\hat{x}_{0|0,\theta_i} = \hat{\tau}_{n0}$, and $P_{0|0,\theta_i} =
\mathrm{Var}(\tau_{n0})$. Thus, for the JD/E technique, $\nu_k$ is held constant at its initial estimate
$\nu_{nom}$. For the single sensor evaluation the carrier frequency was $\omega_c = 2\pi \times 10 \times 10^6$.
The translation frequency was $\omega_t = 0$. Since the signal is oversampled ($f_s = 10 f_c$),
it is not necessary to translate the signal.

The single sensor model was used to compare the use of multiple filters ($N =
7$) to a single filter ($N = 1$) for joint detection/estimation. With only one filter,
$\hat{x}_{0|0,\theta_1} = \tau_{nom}$, $P_{0|0,\theta_1} = (7 t_s)^2/12$, as described previously. The initial estimates
of time delay for the multiple filter implementation are given by $\hat{\tau}_{n0} = (n - 4)\, t_s +
\tau_{nom}$, $n = 1, 2, \ldots, 7$. Thus, the initial delay estimates were separated by $t_s$, with
$\mathrm{Var}(\tau_{n0}) = t_s^2/12$, $\forall n$. The a priori probabilities are given by $P(\theta_n | Z_0) = 1/N$, $1 \le
n \le N$.

Figure 7.1 illustrates typical simulation results for the JD/E performance
with a bank of seven filters ($N = 7$). This figure displays the received signal $z_k$, the
estimation error, the error covariance $P_{k|k,\theta_i}$, and the a posteriori probability
$p(\theta_i | Z_k)$ for all models $\theta_i$, $i = 1, \ldots, 7$. The SNR was 10 dB for this example. This
figure shows that although the covariance converges, the estimation error does not
converge to zero for all models. However, the weighting provided by the a posteriori
probability allows the proper model selection.

The Monte Carlo simulation results for JD/E with a single filter ($N = 1$) and
a bank of seven filters ($N = 7$) are shown in Figure 7.2(a). In this figure the mean
square error (MSE) of the estimation error in $\tau_k$ is shown as a function of SNR,
where $\mathrm{SNR} = 10 \log(E_s/\sigma_v^2)$, for $\tau_k \le k t_s \le \tau_k + t_w$, and $E_s$ is the average received
signal energy per sample. Each point on the graph represents the results of 500
simulation runs. Both the MAP and MMSE estimates are shown in Figure 7.2(a).
The MAP and MMSE estimates are the same for $N = 1$. Also shown on this graph
are the results for the detection-only (D-O) technique. The noise is Gaussian, and
the EKF is used to perform estimation in the JD/E method. The JD/E ($N = 7$)
implementation gives better results than the D-O method, particularly at higher
SNR. This is expected since the filter in the JD/E method allows a considerable
refinement of the estimates at higher SNR as compared to low SNR, where the larger
noise covariance restricts the filter gain. At -5 dB SNR the JD/E and D-O
implementations perform identically. In general, the MMSE estimates are better than
the MAP estimates, particularly at low SNRs. The JD/E ($N = 1$) implementation gives the
worst overall performance. The filter used in this implementation often converges
to poor final estimates due to the tendency, mentioned previously, of time delay to
converge to values that are separated from the actual time delay by multiples of
$\pm 1/f_c$.


The importance of selecting an initial estimate within $\pm 1/(2 f_c)$ is illustrated
in Figure 7.3. This figure compares the JD/E MMSE (eqn 7.10) error distribution for
$N = 1$ to that for $N = 7$. The distributions are formed from the results of 500 trials
at each of the 6 SNR values -5 dB, 0 dB, ..., 20 dB, giving 3000 total observations
of the time delay estimation error. For the JD/E $N = 1$ case, where the initial
estimation error is allowed to range between $\pm 3.5 t_s$ ($\pm 0.35/f_c$) for a single filter,
a significant portion of the error distribution is centered around $1 \times 10^{-7}$, or $1/f_c$.
However, for the $N = 7$ case, in which the initial error distribution is segmented
among the 7 filters, the entire distribution is centered around 0 error. In addition,
the distribution around 0 error for $N = 7$ is tighter than the distribution around 0
for $N = 1$. This suggests that if the number of filters is chosen such that the initial
estimation error of at least one of the filters is small in terms of $1/(2 f_c)$, then the
JD/E procedure can overcome the restriction on the initial estimation error imposed
by (7.33).
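The ambiguity behind the secondary error peak at $1/f_c$ can be seen directly from the single-frequency rectangular-pulse model: shifting the delay by one carrier period leaves every sample unchanged, whereas a half-period shift inverts the pulse. The sketch below uses the 10 MHz carrier and 100 MHz sampling rate of the single sensor evaluation; the short relative delay and the function name are illustrative choices.

```python
import math


def pulse_samples(tau, nu, w_c, t_s, n_samples):
    """Samples of sin(nu w_c (k t_s - tau)) for k = 0 .. n_samples - 1."""
    return [math.sin(nu * w_c * (k * t_s - tau)) for k in range(n_samples)]


f_c = 10e6
w_c = 2.0 * math.pi * f_c
t_s = 1e-8
nu = 1.0
tau = 2.0e-8   # delay measured relative to an arbitrary reference (illustrative)

reference = pulse_samples(tau, nu, w_c, t_s, 12)
full_period = pulse_samples(tau + 1.0 / (nu * f_c), nu, w_c, t_s, 12)
half_period = pulse_samples(tau + 0.5 / (nu * f_c), nu, w_c, t_s, 12)

max_diff_full = max(abs(a - b) for a, b in zip(reference, full_period))
max_diff_half = max(abs(a - b) for a, b in zip(reference, half_period))
```

Because `max_diff_full` is at the level of rounding error, a filter started closer to $\tau + 1/f_c$ than to $\tau$ has no phase information with which to tell the two delays apart.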

The JD/E ($N = 7$) technique is evaluated in lognormal noise in Figure 7.2(b)
for the single sensor model. The MMSE estimates of $\tau_k$ are shown in this figure for
the EKF and for the EHOF. The EKF is evaluated in two configurations. In the first
configuration, the Gaussian pdf is used to evaluate the detection statistic given by
equation (7.9). In the second configuration, the lognormal pdf is used. The EHOF
is evaluated using the lognormal pdf only. The EKF in the second configuration and
the EHOF give very similar results at low SNR. However, at high SNR the EHOF
outperforms the EKF. When the Gaussian pdf is used in conjunction with the EKF
to perform detection, the results are much worse than when the proper lognormal
pdf is used. This advantage is particularly evident at low SNRs.

Figure 7.1 Typical JD/E Results for N = 7, 10 dB SNR

Figure 7.2 JD/E Performance for Time Delay Estimation

7.8.1.2  Double  Sensor  Evaluation

In the multiple sensor case ($P > 1$), the sensors may have different carrier
frequencies ($\omega_{c_p}$), and different translation frequencies ($\omega_{t_p}$). A two-sensor ($P = 2$)
model was evaluated in which $\omega_{c_1} = 2\pi \times 10 \times 10^6$, $\omega_{c_2} = 2\pi \times 30 \times 10^6$, and
$\omega_{t_1} = \omega_{t_2} = 0$. The MMSE results of this evaluation for JD/E ($N = 7$) are given
in Figure 7.2(c). The single-sensor ($P = 1$) MMSE results are also shown in this
figure. This figure illustrates the distinct advantage of centralized fusion for JD/E.

7.8.1.3  Multiple  Pulse  Processing

The results of processing two pulses are given in Figure 7.2(d). The EKF and
EHOF are configured such that the initial error covariance is reset at the beginning
of each pulse. The rationale for this, as discussed in Chapter 4, is to re-excite
the system. This allows poor estimates to converge to smaller
errors, and as shown in Chapter 4 it does not significantly affect those estimates
that have already converged close to the actual state value. Figure 7.2(d) shows an
improvement of about 3 dB for the two pulse estimate over the single pulse estimate.
This improvement is supported by (7.31).

7.8.2  Doppler  Estimation 

Both single and double sensor models ($P = 1$ and $P = 2$) in (7.7) were selected for experimental evaluation for Doppler shift estimation. For this evaluation, the pulse width was set to $24 t_s$ and $c$, the speed of propagation, was 186000 miles/sec. For all tests, the nominal time delay and Doppler were $\tau_{nom} = 0.000324$ and $(\nu_{nom} - 1) = 8.96 \times 10^{-7}$ respectively, corresponding to a target at a nominal range of 10 miles, traveling at 300 mph Doppler velocity. The sampling frequency was set to $f_s = 4 \times 10^3$ Hz. The error in the Doppler estimate was assumed to be uniformly distributed at $\pm 7.47 \times 10^{-7}$ about the nominal Doppler. This corresponds to an error in the Doppler velocity of $\pm 250$ mph. The corresponding variance is $1.85 \times 10^{-13}$. It is observed from (7.32) that the Doppler error variance can be decreased by increasing the pulse width or by increasing the carrier frequency $\omega_c$. This is the reason for the large pulse width (6 msec) for Doppler estimation versus the relatively small pulse width (0.12 $\mu$sec) used for time delay estimation.
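The numerical values quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that the Doppler parameter satisfies $(\nu - 1) = 2V/c$ (which is consistent with the quoted figures) and that the initial error is uniform on a symmetric interval; the function names are illustrative only and do not come from the original work.

```python
# Sketch: reproduce the Doppler parameters quoted in the text.
# Assumption (not stated explicitly in this section): (nu - 1) = 2 V / c.

C_MILES_PER_SEC = 186000.0   # speed of propagation used in the text
SECONDS_PER_HOUR = 3600.0


def doppler_offset(velocity_mph, c=C_MILES_PER_SEC):
    """Return (nu - 1) for a Doppler velocity given in miles per hour."""
    return 2.0 * velocity_mph / (SECONDS_PER_HOUR * c)


def uniform_variance(half_width):
    """Variance of a uniform distribution on [-half_width, +half_width]."""
    return (2.0 * half_width) ** 2 / 12.0


nu_nom_offset = doppler_offset(300.0)   # nominal Doppler offset (300 mph)
nu_max = doppler_offset(250.0)          # maximum initial excursion (250 mph)
nu_variance = uniform_variance(nu_max)  # variance of the initial Doppler error

f_s = 4.0e3                             # sampling frequency in Hz
t_s = 1.0 / f_s                         # sample time in seconds
pulse_width = 24 * t_s                  # pulse width of 24 samples, in seconds

print(f"nu_nom - 1  = {nu_nom_offset:.3e}")
print(f"nu_max      = {nu_max:.3e}")
print(f"variance    = {nu_variance:.3e}")
print(f"pulse width = {pulse_width * 1e3:.1f} msec")
```

Under these assumptions the computed values agree with the figures in the text to within rounding, including the 6 msec pulse width implied by 24 samples at 4 kHz.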

7.8.2.1 Single Sensor Evaluation

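For Doppler-only estimation the measurement model (7.14) is evaluated with the time delay fixed at $\tau_{nom}$, so that the signal depends on the Doppler state $\nu_k$ alone over the interval $\tau_{nom} < k t_s < \tau_{nom} + t_w$, with $\hat{x}_{0|0,\theta_j} = \nu_{m0}$ and $P_{0|0,\theta_j} = \mathrm{Var}(\nu_{m0})$. Thus, for the JD/E technique, $\hat{\tau}_k$ is held constant at its initial estimate $\tau_{nom}$. For the single sensor case the carrier and translation frequencies were $\omega_c = 2\pi \times 100 \times 10^6$ and $\omega_t = 2\pi \times 99.9975 \times 10^6$.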

The single sensor model was used to compare the use of seven Doppler filters ($M = 7$) to a single filter ($M = 1$) for joint detection/estimation. With only one filter, $\hat{x}_{0|0,\theta_j} = \nu_{nom}$ and $P_{0|0,\theta_j} = (2\nu_{max})^2/12$, where $\nu_{max}$ is the maximum initial Doppler shift excursion due to the maximum Doppler velocity of 250 mph: $\nu_{max} = (2 \times 250)/(3600 \times c) = 7.46 \times 10^{-7}$. The initial estimates of Doppler for the multiple filter implementation are given by $\nu_{m0} = (m - 4)\Delta\nu + \nu_{nom}$, $m = 1, 2, \ldots, 7$, where $\Delta\nu = 2\nu_{max}/7$. The initial variance for each of the 7 filters is $\mathrm{Var}(\nu_{m0}) = (\Delta\nu)^2/12$, $\forall m$. The a priori probabilities are given by $P(\theta_m | Z_0) = 1/M$, $1 \le m \le M$. The Monte Carlo simulation results for JD/E for $M = 1$ and $M = 7$ are shown in Figure 7.4(a). In this figure the mean square error (MSE) of the estimation error in $\nu_k$ is shown as a function of SNR. Both the MAP and MMSE estimates are shown in Figure 7.4(a). The MAP and MMSE estimates are the same for $M = 1$. Also shown on this graph are the results for the detection-only (D-O) technique. The JD/E ($M = 7$) results are essentially the same as the JD/E ($M = 1$) results. Thus, in this case there is no advantage in using more than one filter to estimate the Doppler shift. (This is in contrast to the time delay estimation results shown in Figure 7.2(a), in which the JD/E $N = 7$ performance was much better than that for JD/E $N = 1$.) The JD/E ($M = 7$) implementation gives better results than the D-O method, particularly at higher SNR. This is expected since the filter in the JD/E method allows considerable refinement of the estimates at higher SNR, as compared to low SNR where the larger noise covariance restricts the filter gain. At $-5$ dB SNR the JD/E and D-O implementations perform identically. In general, the MMSE estimates are better than the MAP estimates, particularly at low SNR.
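The initialization of the parallel Doppler filters described above can be sketched as follows. This is an illustrative sketch only: the function name and argument names are hypothetical, and the center index is generalized as $(M + 1)/2$, which reduces to the value 4 used in the text for $M = 7$ and to the single-filter initialization for $M = 1$.

```python
def doppler_filter_bank(nu_nom, nu_max, num_filters):
    """Initial estimates, common initial variance and priors for M filters.

    The interval [nu_nom - nu_max, nu_nom + nu_max] is split into
    num_filters equal cells; each filter starts at the center of one cell
    with the variance of a uniform error over that cell.
    """
    delta_nu = 2.0 * nu_max / num_filters          # cell width
    center_index = (num_filters + 1) / 2.0         # equals 4 when M = 7
    estimates = [(m - center_index) * delta_nu + nu_nom
                 for m in range(1, num_filters + 1)]
    variance = delta_nu ** 2 / 12.0                # uniform error in a cell
    priors = [1.0 / num_filters] * num_filters     # equal a priori weights
    return estimates, variance, priors


# Seven-filter and single-filter configurations with the values in the text.
NU_MAX = 7.46e-7
NU_NOM = 1.0 + 8.96e-7
bank7 = doppler_filter_bank(NU_NOM, NU_MAX, 7)
bank1 = doppler_filter_bank(NU_NOM, NU_MAX, 1)
print(bank7[0])
print(bank1)
```

With a single filter the sketch returns the nominal Doppler as the only initial estimate and $(2\nu_{max})^2/12$ as its variance, matching the single-filter case described above.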

The JD/E ($M = 7$) technique is evaluated in lognormal noise in Figure 7.4(b) for the single sensor model. The MMSE estimates of $\hat{\nu}_k$ are shown in this figure for the EKF and for the EHOF. The EKF is evaluated in two configurations. In the first configuration the Gaussian pdf is used to evaluate the detection statistic given by equation (7.9). In the second configuration the lognormal pdf is used. The EHOF is evaluated using the lognormal pdf only.

Figure 7.4 JD/E Performance for Doppler Shift Estimation

7.8.2.2 Double Sensor Evaluation

In the multiple sensor case ($P > 1$) the sensors may have different carrier frequencies ($\omega_{c_p}$) and different translation frequencies ($\omega_{t_p}$). A two-sensor ($P = 2$) model was evaluated in which $\omega_{c_1} = 2\pi \times 100 \times 10^6$, $\omega_{t_1} = 2\pi \times 99.9975 \times 10^6$, $\omega_{c_2} = 2\pi \times 200 \times 10^6$, and $\omega_{t_2} = 2\pi \times 199.9975 \times 10^6$. The MMSE results of this evaluation for JD/E ($M = 7$) are given in Figure 7.4(c). The single-sensor ($P = 1$) MMSE results are also shown in this figure. This figure illustrates the distinct advantage of centralized fusion for JD/E.

7.8.2.3 Multiple Pulse Processing

The double pulse model is compared to the single pulse model in Figure 7.4(d). Recall from (7.36) that the variance in frequency (and Doppler) estimates is a function of the inverse pulse width squared. However, processing two pulses does not give the same advantage as processing an equivalent pulse of size $2t_w$. As shown in Figure 7.4(d) the advantage is approximately 3 dB, the same as for time delay estimation (Figure 7.2(d)).

7.9 Conclusion

The  space-time  modeling  of  the  signal  returns  as  described  in  (7.14)  has  been 
used  in  conjunction  with  nonlinear  filters  to  design  a  new  adaptive  sensor  processor. 
Simulation  results  show  excellent  detection  capabilities  and  excellent  resolution  in 
target  parameter  estimation  for  both  single  and  multiple  sensor  data.  With  the 
excellent  detectability,  fine  parameter  resolution,  and  automatic  data  referencing, 
this  approach  presents  a  very  competitive  design  for  target  detection  and  parameter 
estimation. 


The most significant result from the implementation of the JD/E technique for time delay estimation is that the requirement that the initial estimation error $\Delta\tau < 1/(2f_c)$ can be relaxed by implementing several parallel filters.

Chapter 8

Multisensor  Detection  and  Signal  Parameter  Estimation 

This chapter addresses the problem of multisensor detection and high resolution signal parameter estimation using joint maximum a posteriori detection and high order nonlinear filtering techniques. The specific problem addressed is that of two spatially separated sensors that employ active echo processing to estimate the parameters of a target. The geometric area of coverage of the two sensors is permitted to overlap. In the overlap region the estimates from the two sensors are combined to produce improved estimates over the single sensor estimates.

The problem is approached using joint detection/estimation techniques. Several hypotheses are postulated for detection. Each hypothesis corresponds to the ability of each sensor to detect the target in its area of coverage. The a priori probabilities of each decision are based on the area of coverage of the two sensors. For each hypothesis, a high order filter recursively estimates time delay, Doppler shift and geometric angle to the target from processing the returns of the transmitted signal from each sensor. These estimates are in turn used to estimate target position and velocity. For each of these hypotheses, another set of parallel filters is used to obtain more accurate estimates of signal parameters and to account for the stability problems that result from the first order Taylor series expansion used in the nonlinear filtering algorithms. This is accomplished by operating a separate filter for each of several different initial time delay estimates of the return signal. The maximum likelihood estimate for a given hypothesis is then determined as a weighted sum of the estimates from each of the local hypotheses, with the a posteriori probability being used as the weighting function. It is assumed that the signals are embedded in Gaussian noise and clutter. The clutter is treated as non-Gaussian noise with a lognormal or Weibull distribution.

Consider the situation of two spatially separated sensors, $s_1$ and $s_2$. Each of the two sensors attempts to detect and track objects coming into its respective area of coverage. For a valid data fusion scenario, the coverage of the two sensors is assumed to overlap in space, but not entirely. The sensor geometry is shown in Figure 8.1. In the overlap region the data received by the two sensors can be combined to get a more accurate estimate of target parameters or to estimate parameters that cannot be estimated with one sensor alone. We consider the case where each of the sensors may have different types of tracking devices such as optical trackers, various types of radars, etc. It is assumed that these sensors transmit a signal and process the echo returned from that signal. It is assumed that the signals are corrupted by additive Gaussian noise due to thermal effects within the receiver, and by clutter which may be due to non-Gaussian distortion such as sea clutter or other multipath spreading. The amplitude of sea clutter is characterized by statistical fluctuations which may be described in terms of a probability density function. Typical distributions used to model this distortion include the Rayleigh, Weibull or lognormal distributions [81, pp. 478-479]. In general, the Gaussian noise introduced into the receiver is uncorrelated between the two sensors.

8.1  System  Model 

Figure 8.1 Sensor/Target Geometry for Multisensor Fusion

Assume that each sensor consists of a phased array or other sensing device that can produce target angle estimates along with estimates of time delay and Doppler shift. It is assumed that there are two separate measurements taken at each sensor, one measurement at each of the offset phase centers. The received signal at the $i$th sensor may be described by

$$\mathbf{z}_{ik} = \mathbf{h}_{ik} + \mathbf{u}_{ik} + \mathbf{v}_{ik} \tag{8.1}$$

where $\mathbf{h}_{ik}$ represents the received signal, $\mathbf{u}_{ik}$ is the clutter, and $\mathbf{v}_{ik}$ is the Gaussian noise at the $k$th sampling interval. Since there are two measurements, the received signal can be more explicitly stated as


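$$\begin{bmatrix} z_{i1k} \\ z_{i2k} \end{bmatrix} = \begin{bmatrix} h_{i1k} \\ h_{i2k} \end{bmatrix} + \begin{bmatrix} u_{i1k} \\ u_{i2k} \end{bmatrix} + \begin{bmatrix} v_{i1k} \\ v_{i2k} \end{bmatrix} \tag{8.2}$$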


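The received signal vector $\mathbf{h}_{ik}$ at sensor $i$ can be described by the following model.

$$\mathbf{h}_{ik} = \begin{bmatrix} h_{i1k} \\ h_{i2k} \end{bmatrix} = \begin{bmatrix} a(kt_s - \tau_{i1} + \tau_{i2}/2)\, p(kt_s - \tau_{i1} + \tau_{i2}/2)\, \sin(\nu_i(\omega_i(kt_s - \tau_{i1} + \tau_{i2}/2))) \\ a(kt_s - \tau_{i1} - \tau_{i2}/2)\, p(kt_s - \tau_{i1} - \tau_{i2}/2)\, \sin(\nu_i(\omega_i(kt_s - \tau_{i1} - \tau_{i2}/2))) \end{bmatrix} \tag{8.3}$$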


where $a(\cdot)$ is the amplitude, $p(\cdot)$ is the pulse shaping function, and $t_s$ is the sample time ($t_s = 1/f_s$). The delay $\tau_{i1}$ is the round-trip propagation time from the center of the sensor to the target and back to the sensor. Referring to Figure 8.2, this is the time for the signal to travel from point $P_i$ to $O$ and back to point $P_i$. From $\tau_{i1}$ the range to the target can be determined using the relationship

$$R_i = \frac{c\,\tau_{i1}}{2} \tag{8.4}$$

where $c$ is the speed of propagation. The delay $\tau_{i2}$ is the difference in time for the signal to reach from point $P_{i1}$ to point $P_{i2}$. The difference in the propagation distance is given by $c\tau_{i2}$. The differential angle $\Delta\phi_i$ to the target from sensor $i$, which represents the difference between the sensor pointing angle $\phi_{i0}$ and the actual target angle $\phi_i$, is then

$$\Delta\phi_i = \sin^{-1}\left(\frac{c\,\tau_{i2}}{D_i}\right), \qquad \phi_i = \phi_{i0} + \Delta\phi_i \tag{8.5}$$

where $D_i$ is the distance between the two offset phase centers in the phased array for sensor $i$. The initial estimates of $\tau_{i2}$ are based on the geometric relationship shown in Figure 8.2. This figure shows that the geometric angle $\Delta\phi_i$ is given by

$$\sin(\Delta\phi_i) = \frac{c\,\tau_{i2}}{D_i} \tag{8.6}$$
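As an illustration of how the two delays map to range and angle, the relations (8.4) through (8.6) can be sketched in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the units of `c`, the delays and the phase-center separation only need to be mutually consistent, and the position is expressed in coordinates centered on the sensor.

```python
import math


def range_from_delay(tau_1, c):
    """Range to the target from the round-trip delay, R = c * tau_1 / 2."""
    return c * tau_1 / 2.0


def target_angle(tau_2, phase_center_separation, pointing_angle, c):
    """Target angle: pointing angle plus asin(c * tau_2 / D)."""
    sine = c * tau_2 / phase_center_separation
    if abs(sine) > 1.0:
        raise ValueError("differential delay too large for this separation")
    return pointing_angle + math.asin(sine)


def target_position(tau_1, tau_2, phase_center_separation, pointing_angle, c):
    """Sensor-centered position (x, y) implied by the two delays."""
    rng = range_from_delay(tau_1, c)
    phi = target_angle(tau_2, phase_center_separation, pointing_angle, c)
    return rng * math.cos(phi), rng * math.sin(phi)


# Example: zero differential delay places the target on the pointing axis.
print(target_position(2.0, 0.0, 1.0, math.pi / 4, 10.0))
```

The guard on the arcsine argument is a design choice of this sketch: a differential delay larger than $D_i/c$ is not geometrically possible and usually indicates a poor initial estimate.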

The function $p(\cdot)$ in (8.3) represents the pulse shaping function and is generally designed to limit the signal bandwidth at the expense of widening the main lobe of the function in the frequency domain. Several possible pulse shapes and their spectral characteristics are given by Harris [82]. It is assumed that the signal is attenuated by spherical spreading loss such that the received amplitude $a(\cdot)$ is related to the transmitted amplitude $A$ through the relation

$$a(kt_s - (\tau_{i1} \pm \tau_{i2}/2)) = \frac{A}{\left(\frac{c}{2}(\tau_{i1} \pm \tau_{i2}/2)\right)^2} \tag{8.7}$$

For constant receiver noise power $\sigma_g^2$, the signal to white noise ratio at the receiver is given by

$$\frac{E_p\, a(kt_s - (\tau_{i1} \pm \tau_{i2}/2))^2}{2\sigma_g^2} = \frac{8 E_p A^2}{c^4 (\tau_{i1} \pm \tau_{i2}/2)^4 \sigma_g^2} \tag{8.8}$$

where $E_p$ is the average pulse energy per sample. Given that the transmitted amplitude $A$ and the carrier frequency $\omega_i$ are known, the unknown delays $\tau_{i1}$ and $\tau_{i2}$ and the Doppler shift parameter $\nu_i$ must be estimated. The Doppler velocity $V_{d_i}$ is the projection of the target velocity along the line of sight from sensor $i$ to the target and is given by $V_{d_i} = |\mathbf{V}|\cos(\gamma_i)$, where $\mathbf{V}$ is the target velocity vector. $V_{d_i}$ is related to $\nu_i$ through the relation

$$V_{d_i} = \frac{c\,(\nu_i - 1)}{2} \tag{8.9}$$

Figure 8.2


The estimation accuracy and the number of target motion parameters that can be estimated are a function of the position of the target within the coverage area of each sensor. If the target is located in a region covered only by a single sensor then estimates of the Doppler shift and the two time delays from this single sensor can be used to estimate only the target position and Doppler velocity $V_{d_i}$ for that sensor. If the target is in the overlap region then estimates of the Doppler shift and the two delays from each sensor can be used to obtain a more accurate estimate of target position in addition to estimating the complete target velocity vector. The models for these two situations are developed next.

8.1.1  Single  Observer  Model 

Using estimates of $\tau_{i1}$, $\tau_{i2}$ and $\nu_i$ from one sensor the target position and Doppler velocity can be estimated through the relations (8.4), (8.5), and (8.9). Define the state variable vector for sensor $i$ as

$$\mathbf{x}_{ik} = [\tau_{i1k} \;\; \tau_{i2k} \;\; \nu_{ik}]^T$$

It is assumed that the state does not change while the pulse is being reflected from the target. Therefore the process equation is not necessary; that is, the state transition matrix is the identity and there is no process noise. In terms of the state variables the received signal at sensor $i$ is

$$\mathbf{h}_{ik} = \begin{bmatrix} a_{i1k}(\mathbf{x}_{ik})\, p_{i1k}(\mathbf{x}_{ik})\, r_{i1k}(\mathbf{x}_{ik}) \\ a_{i2k}(\mathbf{x}_{ik})\, p_{i2k}(\mathbf{x}_{ik})\, r_{i2k}(\mathbf{x}_{ik}) \end{bmatrix} \tag{8.10}$$

where

$$\begin{aligned}
a_{i1k}(\mathbf{x}_{ik}) &= \frac{4A}{\left(c\,(x_{ik}(1) - x_{ik}(2)/2)\right)^2} \\
a_{i2k}(\mathbf{x}_{ik}) &= \frac{4A}{\left(c\,(x_{ik}(1) + x_{ik}(2)/2)\right)^2} \\
p_{i1k}(\mathbf{x}_{ik}) &= 0.5\,(1 - \cos(2\pi x_{ik}(3)(kt_s - x_{ik}(1) + x_{ik}(2)/2)/t_{w_i})) \\
p_{i2k}(\mathbf{x}_{ik}) &= 0.5\,(1 - \cos(2\pi x_{ik}(3)(kt_s - x_{ik}(1) - x_{ik}(2)/2)/t_{w_i})) \\
r_{i1k}(\mathbf{x}_{ik}) &= \cos(x_{ik}(3)(\omega_i(kt_s - x_{ik}(1) + x_{ik}(2)/2))) \\
r_{i2k}(\mathbf{x}_{ik}) &= \cos(x_{ik}(3)(\omega_i(kt_s - x_{ik}(1) - x_{ik}(2)/2)))
\end{aligned} \tag{8.11}$$

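The definition of $p_{ijk}(\mathbf{x}_{ik})$ given above represents the Hanning pulse type with pulse width $t_{w_i}$. The filter equations require the derivative of the signal model with respect to the state. This derivative is given by

$$\frac{\partial \mathbf{h}_{ik}}{\partial \mathbf{x}_{ik}} = \begin{bmatrix} \dfrac{\partial a_{i1k}}{\partial \mathbf{x}_{ik}} p_{i1k} r_{i1k} + a_{i1k} \dfrac{\partial p_{i1k}}{\partial \mathbf{x}_{ik}} r_{i1k} + a_{i1k} p_{i1k} \dfrac{\partial r_{i1k}}{\partial \mathbf{x}_{ik}} \\[2ex] \dfrac{\partial a_{i2k}}{\partial \mathbf{x}_{ik}} p_{i2k} r_{i2k} + a_{i2k} \dfrac{\partial p_{i2k}}{\partial \mathbf{x}_{ik}} r_{i2k} + a_{i2k} p_{i2k} \dfrac{\partial r_{i2k}}{\partial \mathbf{x}_{ik}} \end{bmatrix} \tag{8.12}$$

where

$$\begin{aligned}
\frac{\partial a_{ijk}}{\partial x_{ik}(1)} &= \frac{-8A}{c^2\,(x_{ik}(1) - \kappa_j x_{ik}(2)/2)^3} \\
\frac{\partial a_{ijk}}{\partial x_{ik}(2)} &= \frac{\kappa_j 4A}{c^2\,(x_{ik}(1) - \kappa_j x_{ik}(2)/2)^3} \\
\frac{\partial p_{ijk}}{\partial x_{ik}(1)} &= -(\pi x_{ik}(3)/t_{w_i}) \sin(2\pi x_{ik}(3)(kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2)/t_{w_i}) \\
\frac{\partial p_{ijk}}{\partial x_{ik}(2)} &= 0.5\,\kappa_j (\pi x_{ik}(3)/t_{w_i}) \sin(2\pi x_{ik}(3)(kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2)/t_{w_i}) \\
\frac{\partial p_{ijk}}{\partial x_{ik}(3)} &= (\pi (kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2)/t_{w_i}) \sin(2\pi x_{ik}(3)(kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2)/t_{w_i}) \\
\frac{\partial r_{ijk}}{\partial x_{ik}(1)} &= +x_{ik}(3)\,\omega_i \sin(x_{ik}(3)(\omega_i(kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2))) \\
\frac{\partial r_{ijk}}{\partial x_{ik}(2)} &= -0.5\,\kappa_j x_{ik}(3)\,\omega_i \sin(x_{ik}(3)(\omega_i(kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2))) \\
\frac{\partial r_{ijk}}{\partial x_{ik}(3)} &= -\omega_i (kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2) \sin(x_{ik}(3)(\omega_i(kt_s - x_{ik}(1) + \kappa_j x_{ik}(2)/2)))
\end{aligned} \tag{8.13}$$

for $j = 1, 2$. $\kappa_j = +1$ whenever $j = 1$. $\kappa_j = -1$ whenever $j = 2$.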
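The product-rule structure of the signal derivative can be checked numerically. The sketch below is illustrative only: it implements one channel $h_{ijk} = a_{ijk} p_{ijk} r_{ijk}$ with the effective delay taken as $x(1) - \kappa_j x(2)/2$ (the form implied by the derivative expressions), ignores the gating of the pulse to its duration, and uses hypothetical function and parameter names.

```python
import math


def channel_terms(x, k, kappa, t_s, t_w, omega, amp, c):
    """Return (a, p, r, arg, delay) for one phase-center channel.

    x = [tau_1, tau_2, nu]; kappa = +1 for channel j = 1, -1 for j = 2.
    """
    delay = x[0] - kappa * x[1] / 2.0        # effective round-trip delay
    arg = k * t_s - delay                    # time measured from pulse arrival
    a = 4.0 * amp / (c * delay) ** 2         # spherical spreading loss
    p = 0.5 * (1.0 - math.cos(2.0 * math.pi * x[2] * arg / t_w))  # Hanning
    r = math.cos(x[2] * omega * arg)         # Doppler-scaled carrier
    return a, p, r, arg, delay


def signal_value(x, k, kappa, t_s, t_w, omega, amp, c):
    """Signal sample h = a * p * r for one channel."""
    a, p, r, _, _ = channel_terms(x, k, kappa, t_s, t_w, omega, amp, c)
    return a * p * r


def signal_gradient(x, k, kappa, t_s, t_w, omega, amp, c):
    """Gradient of h with respect to [tau_1, tau_2, nu] by the product rule."""
    a, p, r, arg, delay = channel_terms(x, k, kappa, t_s, t_w, omega, amp, c)
    sin_p = math.sin(2.0 * math.pi * x[2] * arg / t_w)
    sin_r = math.sin(x[2] * omega * arg)
    da = [-8.0 * amp / (c ** 2 * delay ** 3),
          kappa * 4.0 * amp / (c ** 2 * delay ** 3),
          0.0]
    dp = [-(math.pi * x[2] / t_w) * sin_p,
          kappa * 0.5 * (math.pi * x[2] / t_w) * sin_p,
          (math.pi * arg / t_w) * sin_p]
    dr = [x[2] * omega * sin_r,
          -0.5 * kappa * x[2] * omega * sin_r,
          -omega * arg * sin_r]
    return [da[n] * p * r + a * dp[n] * r + a * p * dr[n] for n in range(3)]
```

A central finite difference of `signal_value` in each state component is a convenient way to confirm that `signal_gradient` is consistent with the signal model before it is used in a filter.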


It  is  assumed  that  the  error  in  the  initial  conditions  is  not  correlated  with  the 
measurement  noise  and  that  the  Gaussian  noise  is  not  correlated  with  the  clutter. 
Thus  the  measurement  noise  covariance  is  given  by 


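$$\begin{aligned}
R_{ik} &= E[(\mathbf{u}_{ik} + \mathbf{v}_{ik})(\mathbf{u}_{ik} + \mathbf{v}_{ik})^T] \\
&= E[\mathbf{u}_{ik}\mathbf{u}_{ik}^T] + E[\mathbf{v}_{ik}\mathbf{v}_{ik}^T] \\
&= \sigma_{u_i}^2 \begin{bmatrix} 1 & \rho_i \\ \rho_i & 1 \end{bmatrix} + \sigma_g^2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\end{aligned} \tag{8.14}$$

where $\sigma_{u_i}^2$ denotes the clutter variance and $\sigma_g^2$ the Gaussian noise variance, it is assumed that the Gaussian measurement noise and the clutter are uncorrelated, and the clutter has correlation coefficient $\rho_i$ between the two offset phase center beams. The third moment of the measurement noise is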

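$$R^{(3)}_{ik} = E[(\mathbf{u}_{ik} \otimes \mathbf{u}_{ik})\,\mathbf{u}_{ik}^T] \tag{8.15}$$

which contains components only due to the clutter since the Gaussian noise has a symmetrical density function. The fourth moment of the measurement noise is

$$R^{(4)}_{ik} = E[(\mathbf{u}_{ik} \otimes \mathbf{u}_{ik})(\mathbf{u}_{ik} \otimes \mathbf{u}_{ik})^T] + E[(\mathbf{v}_{ik} \otimes \mathbf{v}_{ik})(\mathbf{v}_{ik} \otimes \mathbf{v}_{ik})^T] \tag{8.16}$$

8.1.2 Double Observer Model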

When  information  is  available  from  two  sensors,  that  is,  whenever  the  target 
is  in  the  overlap  region,  and  the  target  is  illuminated  simultaneously  by  the  two 
radars,  the  Doppler  and  time  delay  estimates  from  each  sensor  can  be  combined  to 
obtain  a  better  estimate  of  target  position  and  velocity. 

Let $X'$ and $Y'$ denote the directions of a local coordinate system as shown in the insert in Figure 8.1. Let $\phi_{10}$ and $\phi_{20}$, the pointing angles of the two sensors, be chosen such that $\phi_{20} - \phi_{10} = 90$ deg. In this case the direction $X'$ points directly along the line of sight (LOS) of $s_2$, and perpendicular to the LOS of $s_1$. Likewise, $Y'$ points directly along the LOS of $s_1$ and perpendicular to the LOS of $s_2$. $X'$ is the in-track direction for $s_1$ and the cross-track direction for $s_2$. $Y'$ is the in-track direction for $s_2$ and the cross-track direction for $s_1$. For small angles $\Delta\phi_i$ (such that $\sin(\Delta\phi_i) \approx 0$), the position estimates in the $X', Y'$ coordinate system are given by

$$\begin{aligned}
\hat{O}_{x_1} &= R_{10}\, c\,\hat{\tau}_{12}/D_1 \\
\hat{O}_{x_2} &= -(c\,\hat{\tau}_{21}/2 - R_{20}) \\
\hat{O}_{y_1} &= c\,\hat{\tau}_{11}/2 - R_{10} \\
\hat{O}_{y_2} &= R_{20}\, c\,\hat{\tau}_{22}/D_2
\end{aligned} \tag{8.17}$$

where $R_{i0}$ is the nominal range from sensor $i$ to the center of the insert in Figure 8.1. The associated position error variances are given by

$$\begin{aligned}
\sigma_{x_1}^2 &= R_{10}^2\, c^2\, \mathrm{Var}[\hat{\tau}_{12}]/D_1^2 \\
\sigma_{x_2}^2 &= c^2\, \mathrm{Var}[\hat{\tau}_{21}]/4 \\
\sigma_{y_1}^2 &= c^2\, \mathrm{Var}[\hat{\tau}_{11}]/4 \\
\sigma_{y_2}^2 &= R_{20}^2\, c^2\, \mathrm{Var}[\hat{\tau}_{22}]/D_2^2
\end{aligned} \tag{8.18}$$

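If it is assumed that the time delay estimation errors have Gaussian distributions then the maximum likelihood estimate of the target position in the overlap region $R_2$ is given by

$$\hat{O}_x = \frac{\sigma_{x_2}^2\, R_{10}\, c\,\hat{\tau}_{12}/D_1 - \sigma_{x_1}^2\, (c\,\hat{\tau}_{21}/2 - R_{20})}{\sigma_{x_1}^2 + \sigma_{x_2}^2} \tag{8.19}$$

$$\hat{O}_y = \frac{\sigma_{y_2}^2\, (c\,\hat{\tau}_{11}/2 - R_{10}) + \sigma_{y_1}^2\, R_{20}\, c\,\hat{\tau}_{22}/D_2}{\sigma_{y_1}^2 + \sigma_{y_2}^2} \tag{8.20}$$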
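The combination rule used in the overlap region is the familiar inverse-variance weighting of two estimates of the same quantity. A minimal sketch follows, assuming the two estimation errors are independent and zero mean; the function name is hypothetical, and the expression for the fused variance is the standard result for this weighting rather than something stated in the text.

```python
def fuse_estimates(estimate_1, variance_1, estimate_2, variance_2):
    """Combine two estimates of one coordinate by inverse-variance weighting.

    Each estimate is weighted by the variance of the other, so the more
    accurate sensor dominates the fused value.
    """
    total = variance_1 + variance_2
    if total <= 0.0:
        raise ValueError("variances must be positive")
    fused = (variance_2 * estimate_1 + variance_1 * estimate_2) / total
    fused_variance = variance_1 * variance_2 / total  # assumes independence
    return fused, fused_variance


# Example: a cross-range estimate (variance 3.0) fused with a range-based
# estimate (variance 1.0) of the same coordinate.
print(fuse_estimates(4.0, 3.0, 0.0, 1.0))
```

Because the fused variance is never larger than the smaller of the two input variances, the fused coordinate is at least as accurate as the better single-sensor estimate under these assumptions.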


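From Figure 8.1 it is seen that the target position $(O_x, O_y)$ can be found from the time delays at either sensor. The position coordinates are determined from

$$\hat{O}_x = \hat{R}_1\cos(\hat{\phi}_1) = R_{12}\cos(\alpha) + \hat{R}_2\cos(\hat{\phi}_2) \tag{8.21}$$

$$\hat{O}_y = \hat{R}_1\sin(\hat{\phi}_1) = R_{12}\sin(\alpha) + \hat{R}_2\sin(\hat{\phi}_2) \tag{8.22}$$

where $\hat{R}_i$ and $\hat{\phi}_i$ are obtained from (8.4) and (8.5). Define the position error variances

$$\begin{aligned}
\sigma_{x_1}^2 &= \mathrm{Var}[\hat{R}_1\cos(\hat{\phi}_1)] \\
\sigma_{x_2}^2 &= \mathrm{Var}[R_{12}\cos(\alpha) + \hat{R}_2\cos(\hat{\phi}_2)] \\
\sigma_{y_1}^2 &= \mathrm{Var}[\hat{R}_1\sin(\hat{\phi}_1)] \\
\sigma_{y_2}^2 &= \mathrm{Var}[R_{12}\sin(\alpha) + \hat{R}_2\sin(\hat{\phi}_2)]
\end{aligned} \tag{8.23}$$

For small angles $\Delta\phi_i$ these error variances can be expressed in terms of the time delay variances through the use of (8.5) and (8.4). This yields

$$\begin{aligned}
\sigma_{x_1}^2 &= R_{10}^2 \sin^2(\phi_{10})(c/D_1)^2\, \mathrm{Var}[\hat{\tau}_{12}] + (c/2)^2 \cos^2(\phi_{10})\, \mathrm{Var}[\hat{\tau}_{11}] \\
\sigma_{x_2}^2 &= R_{20}^2 \sin^2(\phi_{20})(c/D_2)^2\, \mathrm{Var}[\hat{\tau}_{22}] + (c/2)^2 \cos^2(\phi_{20})\, \mathrm{Var}[\hat{\tau}_{21}] \\
\sigma_{y_1}^2 &= R_{10}^2 \cos^2(\phi_{10})(c/D_1)^2\, \mathrm{Var}[\hat{\tau}_{12}] + (c/2)^2 \sin^2(\phi_{10})\, \mathrm{Var}[\hat{\tau}_{11}] \\
\sigma_{y_2}^2 &= R_{20}^2 \cos^2(\phi_{20})(c/D_2)^2\, \mathrm{Var}[\hat{\tau}_{22}] + (c/2)^2 \sin^2(\phi_{20})\, \mathrm{Var}[\hat{\tau}_{21}]
\end{aligned} \tag{8.24}$$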

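In the overlap region the estimates can be combined to form the weighted estimate

$$\hat{O}_x = \frac{\sigma_{x_2}^2\,(\hat{R}_1\cos(\hat{\phi}_1)) + \sigma_{x_1}^2\,(R_{12}\cos(\alpha) + \hat{R}_2\cos(\hat{\phi}_2))}{\sigma_{x_1}^2 + \sigma_{x_2}^2} \tag{8.25}$$

$$\hat{O}_y = \frac{\sigma_{y_2}^2\,(\hat{R}_1\sin(\hat{\phi}_1)) + \sigma_{y_1}^2\,(R_{12}\sin(\alpha) + \hat{R}_2\sin(\hat{\phi}_2))}{\sigma_{y_1}^2 + \sigma_{y_2}^2} \tag{8.26}$$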

The velocity estimate $\hat{V}$ and the Doppler angle estimates $\hat{\gamma}_1$ and $\hat{\gamma}_2$ can also be obtained in the overlap region. With the estimates $\hat{\phi}_1$ and $\hat{\phi}_2$ in hand, the angle $\hat{\zeta}$ is found from

$$\hat{\zeta} = \hat{\phi}_2 - \hat{\phi}_1$$

Using the Doppler velocity equation

$$V_{d_i} = |\mathbf{V}|\cos(\gamma_i), \tag{8.27}$$

and $\gamma_1 = \gamma_2 + \zeta$, the ratio of Doppler velocities gives

$$\frac{V_{d_1}}{V_{d_2}} = \frac{\cos(\gamma_2 + \zeta)}{\cos(\gamma_2)}$$

Solving for $\gamma_2$,

$$\gamma_2 = \tan^{-1}\left[\frac{\cos(\zeta) - V_{d_1}/V_{d_2}}{\sin(\zeta)}\right]$$

With the estimate $\hat{\gamma}_2$, the magnitude of the target velocity $|\mathbf{V}|$ can be found from (8.27), and the target heading is $-\hat{\gamma}_2 + \hat{\phi}_2 - \pi$.

8.2 Joint Detection/Estimation

The target search region has been localized to the rectangular box shown in Figure 8.1. This box is subdivided into several resolution cells as shown in this figure. The beam pattern from sensor $s_1$ allows this sensor to detect a target and estimate its parameters if the target is located in resolution cells 1 through 21. Sensor $s_2$ can detect the target if it is in cells 10 through 15, 22 through 25, or 26 through 31. If the target is not located in any of these cells then the target is declared not present (or more precisely, not detectable). This situation is represented by the null hypothesis $H_0$. The resolution cells are grouped into regions which will be used for minimum mean square error estimation. If the target is located in regions $R_1$ (resolution cells 1 through 9) or $R_3$ (resolution cells 16 through 21) only sensor $s_1$ can detect the target. Regions $R_4$ (resolution cells 22 through 25) and $R_5$ (resolution cells 26 through 31) correspond to the coverage area of sensor $s_2$ only. If the target is located in region $R_2$ (resolution cells 10 through 15) both sensors can detect the target and perform parameter estimation. In this case the estimates can be combined as described in section 8.1.2. The remaining area in the rectangle in Figure 8.1 is designated as region $R_0$, where neither sensor can detect the target.

The joint detection/estimation (JD/E) procedure applied to this problem involves both model uncertainty and uncertain initial conditions. JD/E for this situation is discussed in Chapter 5. Let $\theta_j \in \Theta$ designate the parameter vector that describes the different combinations of model uncertainty and initial condition uncertainty. The parameter vector $\theta_j$ is assumed to be time invariant. The parameter vector $\theta_j$, $j = (n - 1)M + m$, $1 \le n \le N$, $1 \le m \le M$, is defined to be the $(n, m)$th delay/Doppler resolution cell and is used to define $NM + 1$ different combinations of initial conditions and models. Here $n$ corresponds to the range resolution cell number determined from the initial conditions on the two time delays from each sensor, and $m$ corresponds to the velocity resolution cell number determined from the initial conditions on the Doppler shift from each sensor. Since the model uncertainty is associated with the spatial coverage of each sensor, Doppler shift estimation is not considered in the experimental evaluation. The signal carrier frequency $\omega_c$ and pulse width $t_w$ will be assigned values that lead to good delay estimation, but poor Doppler estimation. This is done so that large Doppler uncertainty has limited effect on the model used in each spatial resolution cell, and the focus can be directed primarily on range and azimuth resolution.

The number of Doppler resolution cells is set to $M = 1$ in order to simplify the discussion to follow. For the case $M > 1$ the initial conditions on Doppler shift would be different for each hypothesis. This situation is discussed in Chapter 7. For $M = 1$ the parameter $\theta_j$ is directly associated with the range resolution cell number. Hypothesis $H_{\theta_j}$, redesignated $H_j$, corresponds to the hypothesis that the target is present in range resolution cell $j$.

In region $R_0$, from which neither sensor can detect the target, hypothesis $H_0$ is defined by

$$H_0 : \mathbf{z}_{ik} = \mathbf{u}_{ik} + \mathbf{v}_{ik} \quad \forall k, \; i = 1, 2 \tag{8.28}$$


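For those resolution cells in regions $R_1$ and $R_3$, the hypothesis corresponding to cell $j$, $j = 1, 2, \ldots, 9$ (Region $R_1$), $j = 16, 17, \ldots, 21$ (Region $R_3$), is given by

$$H_j : \begin{cases} z_{1mk} = \begin{cases} u_{1mk} + v_{1mk}, & kt_s < \hat{\tau}_{1mj_k} \\ h_{1mk} + u_{1mk} + v_{1mk}, & \hat{\tau}_{1mj_k} \le kt_s \le \hat{\tau}_{1mj_k} + t_{w_1} \\ u_{1mk} + v_{1mk}, & kt_s > \hat{\tau}_{1mj_k} + t_{w_1} \end{cases} \\[1ex] \mathbf{z}_{2k} = \mathbf{u}_{2k} + \mathbf{v}_{2k} \quad \forall k \end{cases} \tag{8.29}$$

for $m = 1, 2$. The delay $\hat{\tau}_{imj_k}$ is given by

$$\hat{\tau}_{imj_k} = \hat{\tau}_{i1j_k} + \kappa_m \hat{\tau}_{i2j_k} \tag{8.30}$$

where $\kappa_m = +1$ whenever $m = 1$, and $\kappa_m = -1$ whenever $m = 2$. The initial conditions are given by

$$\begin{aligned}
\hat{\mathbf{x}}_{1,0|0,\theta_j} &= [\hat{\tau}_{11j_0}, \; \hat{\tau}_{12j_0}, \; \hat{\nu}_{1_0}]^T \\
P_{1,0|0,\theta_j} &= \mathrm{Diag}\,[\mathrm{Var}(\hat{\tau}_{11j_0}), \; \mathrm{Var}(\hat{\tau}_{12j_0}), \; \mathrm{Var}(\hat{\nu}_{1_0})]
\end{aligned} \tag{8.31}$$

The initial estimates $\hat{\tau}_{i1j_0}$ and $\hat{\tau}_{i2j_0}$, $i = 1, 2$, are chosen such that the position of the target for a signal received at sensor $i$ is at the center of resolution cell $j$. The variances $\mathrm{Var}[\hat{\tau}_{i1j_0}]$ and $\mathrm{Var}[\hat{\tau}_{i2j_0}]$ are determined based on a uniform distribution of the error within the cell.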


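In region $R_2$ both signals are assumed to be present. In this region the hypothesis associated with cell $j$, $j = 10, 11, \ldots, 15$, is given by

$$H_j : \begin{cases} z_{1mk} = \begin{cases} u_{1mk} + v_{1mk}, & kt_s < \hat{\tau}_{1mj_k} \\ h_{1mk} + u_{1mk} + v_{1mk}, & \hat{\tau}_{1mj_k} \le kt_s \le \hat{\tau}_{1mj_k} + t_{w_1} \\ u_{1mk} + v_{1mk}, & kt_s > \hat{\tau}_{1mj_k} + t_{w_1} \end{cases} \\[1ex] z_{2mk} = \begin{cases} u_{2mk} + v_{2mk}, & kt_s < \hat{\tau}_{2mj_k} \\ h_{2mk} + u_{2mk} + v_{2mk}, & \hat{\tau}_{2mj_k} \le kt_s \le \hat{\tau}_{2mj_k} + t_{w_2} \\ u_{2mk} + v_{2mk}, & kt_s > \hat{\tau}_{2mj_k} + t_{w_2} \end{cases} \end{cases} \tag{8.32}$$

for $m = 1, 2$. The initial conditions are given by

$$\begin{aligned}
\hat{\mathbf{x}}_{i,0|0,\theta_j} &= [\hat{\tau}_{i1j_0}, \; \hat{\tau}_{i2j_0}, \; \hat{\nu}_{i_0}]^T \\
P_{i,0|0,\theta_j} &= \mathrm{Diag}\,[\mathrm{Var}(\hat{\tau}_{i1j_0}), \; \mathrm{Var}(\hat{\tau}_{i2j_0}), \; \mathrm{Var}(\hat{\nu}_{i_0})]
\end{aligned} \qquad i = 1, 2 \tag{8.33}$$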


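For those resolution cells in regions $R_4$ and $R_5$, where only sensor 2 can detect the target, the hypothesis corresponding to cell $j$, $j = 22, 23, \ldots, 25$ (Region $R_4$), $j = 26, 27, \ldots, 31$ (Region $R_5$), is given by

$$H_j : \begin{cases} \mathbf{z}_{1k} = \mathbf{u}_{1k} + \mathbf{v}_{1k} \quad \forall k \\[1ex] z_{2mk} = \begin{cases} u_{2mk} + v_{2mk}, & kt_s < \hat{\tau}_{2mj_k} \\ h_{2mk} + u_{2mk} + v_{2mk}, & \hat{\tau}_{2mj_k} \le kt_s \le \hat{\tau}_{2mj_k} + t_{w_2} \\ u_{2mk} + v_{2mk}, & kt_s > \hat{\tau}_{2mj_k} + t_{w_2} \end{cases} \end{cases} \tag{8.34}$$

for $m = 1, 2$. The initial conditions are given by

$$\begin{aligned}
\hat{\mathbf{x}}_{2,0|0,\theta_j} &= [\hat{\tau}_{21j_0}, \; \hat{\tau}_{22j_0}, \; \hat{\nu}_{2_0}]^T \\
P_{2,0|0,\theta_j} &= \mathrm{Diag}\,[\mathrm{Var}(\hat{\tau}_{21j_0}), \; \mathrm{Var}(\hat{\tau}_{22j_0}), \; \mathrm{Var}(\hat{\nu}_{2_0})]
\end{aligned} \tag{8.35}$$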


A maximum a posteriori detection criterion can be used to determine the most likely range resolution cell. This criterion requires the availability of a priori probabilities of each hypothesis, and it requires the probability density functions for the measurements. Define $Z_k = [\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_k]$, where $\mathbf{z}_k = [\mathbf{z}_{1k}^T, \mathbf{z}_{2k}^T]^T$, as the set of all measurements up to time $k$, and let $p(\mathbf{z}_k | Z_{k-1}, \theta_j)$ be the probability density function of $\mathbf{z}_k$ given the measurements $Z_{k-1}$ and hypothesis $H_j$. The a posteriori probability of hypothesis $H_j$ is given by

$$P(\theta_j | Z_k) = \frac{P(\theta_j | Z_{k-1})\, \Lambda_j(\mathbf{z}_k)}{\sum_{\theta_m \in \Theta} P(\theta_m | Z_{k-1})\, \Lambda_m(\mathbf{z}_k)} \tag{8.36}$$

where $\Lambda_j(\mathbf{z}_k)$ is the likelihood ratio defined by

$$\Lambda_j(\mathbf{z}_k) = \frac{p(\mathbf{z}_k | Z_{k-1}, \theta_j)}{p(\mathbf{z}_k | Z_{k-1}, \theta_0)} \tag{8.37}$$

In general the distribution function $p(\mathbf{z}_k | Z_{k-1}, \theta_j)$ is non-Gaussian. Since the measurement noise consists of a sum of Gaussian noise and non-Gaussian clutter, the joint density function consists of a convolution of the Gaussian and non-Gaussian density functions. In general it is not possible to compute this joint density analytically; it must be computed numerically for each iteration of the filter, since the density function changes as the estimate $\hat{\mathbf{x}}_{k|k-1}$ changes.
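The recursive update of the a posteriori probabilities can be illustrated with a short sketch. It assumes the usual Bayesian form in which each prior probability is multiplied by the likelihood (or likelihood ratio) of the newest measurement under that hypothesis and the results are normalized to sum to one; the function names are hypothetical, and the likelihood values would in practice come from the numerically evaluated densities described above.

```python
def update_posteriors(prior_probabilities, likelihood_ratios):
    """One recursion of the hypothesis probabilities.

    prior_probabilities[j] is P(theta_j | Z_{k-1}) and likelihood_ratios[j]
    is the likelihood ratio of the newest measurement under hypothesis j.
    """
    weighted = [p * lam for p, lam in zip(prior_probabilities, likelihood_ratios)]
    total = sum(weighted)
    if total <= 0.0:
        raise ValueError("all hypotheses have zero weight")
    return [w / total for w in weighted]


def map_hypothesis(posterior_probabilities):
    """Index of the hypothesis with the largest a posteriori probability."""
    return max(range(len(posterior_probabilities)),
               key=posterior_probabilities.__getitem__)


# Example with three hypotheses; index 0 plays the role of the null hypothesis.
posteriors = update_posteriors([0.5, 0.25, 0.25], [1.0, 2.0, 4.0])
print(posteriors, map_hypothesis(posteriors))
```

Because the same normalizing sum appears in every term, any common factor in the likelihoods cancels, which is why the update can be written with likelihood ratios rather than with the densities themselves.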


Maximum a posteriori (MAP) detection can be used to decide the most likely hypothesis according to:

$$\text{Choose } H_j : \theta_j = \arg\max_{\theta_m \in \Theta} P(\theta_m | Z_k) \tag{8.38}$$

The MAP estimate from sensor $s_i$ is then the estimate associated with cell $j$ if cell $j$ is in the spatial area of coverage of that sensor.

The  minimum  mean  square  error  estimate  can  be  found  be  combining  the 
estimates  from  all  of  the  cells  with  a  particular  region.  If  the  state  vector  x*  is 
common  to  all  models  the  minimum  mean  squared  error  (MMSE)  estimate  can  be 
used.  The  MMSE  estimate  for  sensor  t  in  region  Rp  can  be  expressed  by 

*?„,=  £  (8-39) 

cellj€/i|>  3 

The most likely region is selected using the MAP criterion. Define $I_p$, $p = 0, 1, \cdots, 5$, as the hypothesis that the target is located in region $R_p$. The a posteriori probability associated with region $R_p$ is the sum of the a posteriori probabilities of all of the cells in that region. This region-level probability is given by

$$P(I_p | Z_k) = \sum_{\text{cell } j \in R_p} P(\theta_j | Z_k) \qquad (8.40)$$

The most likely region is chosen such that

Choose $I_p$ : $p = \arg\max_{q = 0, \cdots, 5} P(I_q | Z_k)$   (8.41)
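The recursion (8.36), the region-level sum (8.40), and the MAP decisions (8.38) and (8.41) can be sketched in a few lines. This is an illustrative sketch only: the function names and the toy numbers are hypothetical, and the densities $p(z_k | Z_{k-1}, \theta_j)$ are used directly in place of the likelihood ratios, which gives the same posterior because any common denominator cancels in the normalization.

```python
import numpy as np

def update_posteriors(prior, likelihood):
    """One step of (8.36).

    prior[j]      = P(theta_j | Z_{k-1})
    likelihood[j] = p(z_k | Z_{k-1}, theta_j), up to a common scale factor
    """
    unnormalized = np.asarray(prior, dtype=float) * np.asarray(likelihood, dtype=float)
    return unnormalized / unnormalized.sum()

def region_posteriors(cell_posterior, cell_region, n_regions):
    """Equation (8.40): sum the cell posteriors belonging to each region."""
    return np.bincount(cell_region, weights=cell_posterior, minlength=n_regions)

# Toy example with 4 cells in 3 regions (all values are made up).
prior = [0.4, 0.2, 0.2, 0.2]
likelihood = [0.1, 0.5, 0.4, 0.2]
cell_region = np.array([0, 1, 1, 2])     # region index of each cell

posterior = update_posteriors(prior, likelihood)
regions = region_posteriors(posterior, cell_region, n_regions=3)
map_cell = int(np.argmax(posterior))     # cell-level MAP decision, as in (8.38)
map_region = int(np.argmax(regions))     # region-level MAP decision, as in (8.41)
```

Note that the MAP region need not contain the MAP cell in general, since a region accumulates probability from all of its cells.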

8.2.1  Definition  of  Priors 

The a priori probabilities of each hypothesis are based on the area coverage of the sensors. The total number of resolution cells shown in Figure 8.1 is 56. Of these, 25 are located in region $R_0$. All cells are assumed to have an equal probability of containing the target. The a priori probabilities are given by

$$P(\theta_0) = 25/56, \qquad P(\theta_j) = 1/56, \quad j = 1, 2, \cdots, 31 \qquad (8.42)$$

The probabilities associated with regions $R_j$, $j = 0, 1, \cdots, 5$ are given by

$$\begin{aligned}
P(I_0) &= 25/56 \\
P(I_1) &= 9/56 \\
P(I_2) &= 6/56 \\
P(I_3) &= 6/56 \\
P(I_4) &= 4/56 \\
P(I_5) &= 6/56
\end{aligned} \qquad (8.43)$$

8.3  Simulation  Experiments 

An experimental study was conducted to evaluate the performance of the multisensor fusion technique. In this evaluation the measurement noise consisted of 50% lognormal noise and 50% Gaussian noise. The nominal angles from sensors $s_1$ and $s_2$ to the target were $\phi_{10} = 45$ deg and $\phi_{20} = 135$ deg, respectively. The nominal range from $s_1$ to the target was 10 miles. The nominal range from sensor $s_2$ to the target was chosen such that the received signal at $s_2$ was 5 dB higher than at $s_1$ for the same transmitted signal level and target strength.

The carrier frequencies used by the two sensors were the same at $f_c = 10 \times 10^6$. This is not a practical situation, since two cooperating sensors would not transmit at the same frequency unless they use the same signal generator and the transmissions from the two sensors are offset in time. However, it is desired to show the effect of only one variable, the relative SNR at each sensor, on the estimation error. Since the operating frequency affects the variance of the estimates as described by (7.31), the operating frequencies are kept the same. Both sensors illuminate the target simultaneously. They both sample the signal at a rate $f_s = 100 \times 10^6$, and both signals have the same pulse width $t_{w_i} = 12/f_s$, $i = 1, 2$. The resolution cell width is $1/f_s$ seconds. The associated initial error variance on time delays $\tau_{11,0}$ and $\tau_{21,0}$ is $(1/f_s)^2/12$, the variance of a uniform distribution over one cell. The corresponding range resolution cell width is $\Delta r_i = c/(2 f_s)$. Thus, from (8.6), the initial variance for the angle-measurement delays is $\mathrm{Var}[\tau_{i2,0}] = \left( (D_i c) / (2 f_s R_i) \right)^2 / 12$, $i = 1, 2$. $D_i$, the separation between phase centers at the sensor, was chosen to be 3 feet for each sensor.
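The cell-based quantities in this paragraph follow from the sampling rate alone. The sketch below is illustrative: it assumes the rate is in hertz, that the initial time delay error is uniform over one resolution cell (which is what the factor of 12 suggests), and that the propagation speed $c$ is the speed of light. None of these assumptions is stated explicitly in the text, and the names are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0   # m/s; assumed propagation speed

def uniform_variance(width):
    """Variance of a uniform distribution over an interval of the given width."""
    return width ** 2 / 12.0

fs = 100e6                                   # sampling rate, assumed to be in Hz
cell_width_s = 1.0 / fs                      # resolution cell width in seconds
pulse_width_s = 12.0 / fs                    # pulse width t_w = 12 / f_s
delay_variance = uniform_variance(cell_width_s)
range_cell_m = SPEED_OF_LIGHT / (2.0 * fs)   # range resolution cell, c / (2 f_s)
```

Under these assumptions the range resolution cell is about 1.5 m and the pulse spans 12 resolution cells.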

The carrier frequencies, pulse widths, and sampling frequencies chosen for this evaluation are the same as those chosen for the time delay estimation experiment in Chapter 7. It was observed in Chapter 7 that the values chosen for these parameters are not conducive to estimation of Doppler shift. Since the primary goal of the evaluation in this chapter is to properly locate the correct region and resolution cell number, the estimation of Doppler shift plays a secondary role. Accordingly, Doppler shift is not estimated in this evaluation. Target positions are selected randomly with a uniform distribution in the spatial area designated by the large box in Figure 8.1.

Simulations were performed for SNRs (at sensor $s_1$) ranging from -10 dB to 10 dB. At each SNR, 500 random target positions were chosen. Of these 500 trials, 228 target positions were randomly chosen in region $R_0$, 91 in $R_1$, 54 in $R_2$, 44 in $R_3$, 40 in $R_4$, and 40 in $R_5$.




Table 8.1 shows the detection results for the EKF using the Gaussian pdf to evaluate the a posteriori density function in (8.37). The average a posteriori probability for correct decisions at the region level is given by $P(I_p | Z_k)$. This is computed as the arithmetic mean of the a posteriori probabilities (8.40) for those trials in which the correct region was chosen using (8.41). This value may be used as a measure of the level of confidence that the proper region was chosen. At the resolution cell level the average probability is denoted $P(\theta_j | Z_k)$. A target was declared present if the a posteriori probability $P(I_p | Z_k)$ for any region $p$, $p = 1, \cdots, 5$, was greater than $P(I_0 | Z_k)$. The probability of detection is labeled $P(I_p | I_q)$, $p, q \neq 0$. This quantity was determined by dividing the total number of declared detections, that is, the number of trials in which a target was declared present in any of the regions $R_1$ through $R_5$, by the total number of trials in which the target was actually located in one of the regions $R_1$ through $R_5$. The probability of false alarm, $P(I_p | I_0)$, was determined by dividing the total number of trials in which a target was declared present when it was actually in $R_0$, by the number of trials in which the target was actually in $R_0$.
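The two ratios defined above can be computed directly from per-trial records. The sketch below is only an illustration of those definitions: it assumes each trial is summarized by the true region index and the decided region index, with index 0 standing for $R_0$, and it counts a detection only when the target is actually in one of $R_1$ through $R_5$. The toy data and names are hypothetical.

```python
import numpy as np

def detection_rates(true_region, decided_region):
    """Return (probability of detection, probability of false alarm).

    A target is "declared present" when the decided region is not R_0,
    and "actually present" when the true region is not R_0.
    """
    true_region = np.asarray(true_region)
    decided_region = np.asarray(decided_region)
    present = true_region != 0
    declared = decided_region != 0
    p_detect = np.count_nonzero(declared & present) / np.count_nonzero(present)
    p_false_alarm = np.count_nonzero(declared & ~present) / np.count_nonzero(~present)
    return p_detect, p_false_alarm

# Toy trial records (made up): 4 trials in R_0 and 4 trials in R_1 ... R_5.
true_region = [0, 0, 0, 0, 1, 2, 3, 5]
decided_region = [0, 0, 1, 0, 1, 2, 0, 4]
p_d, p_fa = detection_rates(true_region, decided_region)
```

In this toy data the last trial counts as a detection even though the wrong region was chosen, which is the distinction between detection and correct classification drawn in the text.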

Table 8.1. Multisensor Fusion Detection and False Alarm Probabilities

The JD/E technique performs very well in terms of locating the proper region. However, the performance is not as good in finding the correct cell. This is because the initial variance in angle delay is so small that the filter cannot properly decide the correct cell number. The probabilities of missed detection $P(I_0 | I_p)$ and correct classification (i.e. not only detection of the target but correct localization at the region level) $P(I_p | I_p)$, $p = 1, \cdots, 5$, are displayed in Table 8.2. The probability of misclassification, which is not shown in this table, is given by $P(I_q | I_p) = 1 - P(I_p | I_p) - P(I_0 | I_p)$, $q \neq p$. Sensor $s_2$ outperforms sensor $s_1$, which is to be expected since the SNR at $s_2$ is 5 dB higher than the SNR at sensor $s_1$. In the overlap region, $R_2$, the classification performance is much better than it is for any other region, with an 85% probability of correct classification.

Table 8.2. Probabilities of Missed Detection and Correct Classification - Region Level

The missed detection and correct classification probabilities at the cell level are shown in Table 8.3. The results are averaged over all of the cells in each region. Again the performance for those cells in region $R_2$ was much better than for any other region. The classification results for regions $R_1$, $R_3$, $R_4$, and $R_5$ were poor even at high SNRs. The results in this table reflect the inability of the sensors to detect the proper cell number in the cross-range direction. Figure 8.1 shows that there are three cells in the cross-range direction for sensor $s_1$, and two cells for sensor $s_2$. Thus, assuming that the cell number cannot be resolved in the cross-range direction, the expected cross-range uncertainty for regions $R_1$ and $R_3$ is 1/3, and the cross-range uncertainty for $R_4$ and $R_5$ is 1/2. This is verified by the experimental results in Table 8.3 at 10 dB SNR.

Table 8.3. Probabilities of Missed Detection and Correct Classification - Cell Level

The estimation results are shown in Figure 8.3. All results shown in this figure are in reference to the $(X', Y')$ coordinate system. Figure 8.3(a) shows the average mean squared error for those detections in regions $R_1$ and $R_3$, in which only $s_1$ has coverage. The results in this figure are consistent with those in Table 8.3 in that the estimates in the cross-track direction $X'$ never improve over the initial estimates regardless of the SNR. Figure 8.3(c) shows similar results for regions $R_4$ and $R_5$, which are covered by sensor $s_2$. Figure 8.3(c) also illustrates the 5 dB performance improvement for sensor $s_2$ over that for $s_1$. Figure 8.3(b) shows the results for both sensors in region $R_2$. In this region, as shown in Table 8.3, the proper cell is almost always found. Thus the cross-range estimation error variance should improve by about 6 dB ($20\log_{10}(2)$) for sensor $s_2$, since the cross-range error for $s_2$ has been localized from 2 cells down to 1. Similarly, the cross-range error variance for sensor $s_1$ in region $R_2$ is reduced by about 10 dB ($20\log_{10}(3)$) since the target has been localized from 3 cells down to 1. This improvement is evident in Figure 8.3(b). Figure 8.3(d) shows the estimation results using the combined measurements obtained from (8.19) and (8.20). Because of the larger variance in the cross-range error for each sensor and the fact that the lines of sight (LOS) of the two sensors are perpendicular, the combined estimate consists of the $X'$ estimate from sensor $s_2$ and the $Y'$ estimate from sensor $s_1$.
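The decibel figures quoted for the overlap region can be reproduced with a one-line calculation. The helper name below is hypothetical. The factor of 20 reflects that narrowing the uncertainty from $n$ cells to one scales the error standard deviation by $1/n$ and hence the variance by $1/n^2$.

```python
import math

def localization_gain_db(cells_before, cells_after=1):
    """Variance reduction in dB when the uncertainty width shrinks by this ratio."""
    return 20.0 * math.log10(cells_before / cells_after)

gain_s2 = localization_gain_db(2)   # sensor s_2: 2 cross-range cells down to 1
gain_s1 = localization_gain_db(3)   # sensor s_1: 3 cross-range cells down to 1
```

These evaluate to roughly 6.0 dB and 9.5 dB, consistent with the "about 6 dB" and "about 10 dB" stated in the text.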

The mean squared errors in the $(X, Y)$ coordinate system are shown in Figure 8.4. Each curve in this figure represents the combined $X$ and $Y$ position errors, since the geometry dictates that the error variance should be the same in each direction. The $(X, Y)$ positions are obtained using (8.21) and (8.22). This figure again illustrates the approximate 5 dB improvement in the estimates from sensor $s_2$ over those of sensor $s_1$ in the nonoverlapped regions, and the significant performance improvement in region $R_2$. It is observed that the $(X, Y)$ position errors for the overlap region shown in Figure 8.4 are significantly worse than those shown in Figure 8.3(d) for the $(X', Y')$ coordinate system. This is particularly evident at high SNRs. This is due to the fact that the cross-track errors are included in the computation of the combined estimates determined by (8.21) and (8.22). The choice of coordinate system can therefore have a large impact on the performance of the estimator.

Figure 8.3 Multisensor Fusion In-Track and Cross-Track Estimation Errors

8.4  Conclusion 

A technique has been presented for multisensor fusion based on a joint detection/estimation procedure. It is shown that excellent performance can be obtained for both target detection and target parameter estimation using this technique. A significant advantage of this technique is that each sensor can perform detection and parameter estimation in a decentralized mode. The final estimates and a posteriori probabilities from each sensor are processed by a centralized processor to derive the optimum estimate.

The method provides an automatic referencing mechanism for the data from the different sensors (automatic data alignment) as long as the geometry and timing of the sweeping beams are known. For optimal target resolution performance, it is found that the lines of sight of the two sensors should be perpendicular to each other at any given time, requiring special synchronization. This implies that if the sweeping angle of one of the sensors, e.g. $s_1$, as a function of time is $\phi_{10}(t)$, the corresponding sweeping angle of sensor $s_2$ must be $\phi_{20}(t) = \pi/2 + \phi_{10}(t)$, a goal that is easily accomplished with an efficient model reference (adaptive) controller.
Figure 8.4 X and Y Position Errors for Single and Multiple Sensors

Chapter 9

Summary  and  Areas  for  Further  Study 

Two high order filters (HOFs) have been presented for estimation in non-Gaussian noise. The first filter is designed for systems with asymmetric probability densities. This asymmetrical filter is developed by using the first and second powers of the innovations in the derivation of the filter equations. The second filter is designed for systems with symmetric probability densities. It is developed based on first and third powers of the innovations. These filters are evaluated experimentally in non-Gaussian noise formed from Gaussian sum distributions. Under these conditions the HOFs perform much better than the standard Kalman filter, and close to the optimal Bayesian estimator, the Gaussian sum filter. However, the primary advantage of using the HOFs occurs either when the noise cannot be adequately represented as Gaussian sums, or when only the moments of the noise are known, and not the actual density functions. Although these filters are more complicated to implement than the standard Kalman filter, they are not nearly as computationally intensive as the Gaussian sum filter, for which the number of parallel filters grows geometrically as the number of stages increases.

For HOFs designed for $l$th order filter moments, implementation requires the availability of prediction error moments of order up to $2l$. In general, when $l > 1$ it is necessary either to truncate the expressions for the filter moments so that only those powers of prediction and measurement error moments are included for which similar powers of the filter moments exist, or to approximate the higher powers of the prediction and measurement error moments. This leads to either the truncation of the filter error moment expressions or the estimation of prediction error moments of order $l + 1$ through $2l$. It is shown that the truncated filter expressions give comparable performance to those with estimated higher order moments.

For  non-Gaussian  distributions  made  up  of  known  Gaussian  sums,  the  non- 
Gaussian  filters  presented  here  give  a  reasonable  compromise  between  the  optimal 
but  very  computationally  intensive  Gaussian  sum  filter,  and  the  suboptimal  but 
easily  implemented  standard  Kalman  filter.  In  addition,  when  only  the  moments 
of  the  distributions  are  known  and  a  Gaussian  sum  filter  cannot  be  used,  the  non- 
Gaussian  filters  offer  a  means  to  obtain  improved  performance  over  the  standard 
Kalman  filter.  One  method  to  improve  the  performance  of  the  non-Gaussian  filters 
is  to  use  higher  powers  of  the  innovations  in  developing  the  filter  equations.  However, 
the  resulting  filter  expressions  would  be  extremely  complicated  and  it  is  anticipated 
that  the  expected  performance  improvement  over  the  HOFs  presented  here  may  be 
marginal. 

A more general filter can be developed by including the first, second, and third order powers of the innovations in developing the filter equations. This can be useful, for example, in a situation in which the measurement noise has an asymmetrical distribution and the process noise has a symmetrical non-Gaussian distribution. The derivation of this filter will follow the same procedure as shown in Chapter 3. Three separate gain matrices will be required in this case.

From the implementation standpoint, a significant reduction in the computational burden imposed by the HOFs can be accomplished by exploiting the redundancy in the high order filter moment matrices. For example, the error covariance matrix is symmetric and can be represented by either its upper or lower triangular part. Similarly, matrices for the 3rd and 4th order expansions also contain a significant amount of redundancy, and efficient algorithms may be developed for including only the necessary terms in these matrices.
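For the covariance case, the redundancy can be illustrated with a simple packed storage scheme. This is a sketch of the general idea, not a scheme taken from this work: an $n \times n$ symmetric matrix is stored as its $n(n+1)/2$ upper triangular entries and rebuilt on demand. The function names and the example matrix are hypothetical.

```python
import numpy as np

def pack_symmetric(matrix):
    """Keep only the upper triangle (including the diagonal) of a symmetric matrix."""
    rows, cols = np.triu_indices(matrix.shape[0])
    return matrix[rows, cols]

def unpack_symmetric(packed, n):
    """Rebuild the full n x n symmetric matrix from its packed upper triangle."""
    matrix = np.zeros((n, n), dtype=packed.dtype)
    rows, cols = np.triu_indices(n)
    matrix[rows, cols] = packed
    matrix[cols, rows] = packed   # mirror into the lower triangle
    return matrix

# Example 3 x 3 error covariance matrix (values are made up).
P = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
packed = pack_symmetric(P)            # 6 stored values instead of 9
restored = unpack_symmetric(packed, 3)
```

The savings grow with the order of the moment: the higher order moment arrays are invariant under more index permutations, so a larger fraction of their entries is redundant.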

This thesis also addresses several signal processing estimation problems with a model-based formalism. These problems are all treated as nonlinear estimation problems in Gaussian and non-Gaussian noise. A direct model is used in which the frequencies, amplitudes, damping coefficients and phases of the sinusoids are defined as state variables. This model has the advantage that the time varying behavior of these parameters can be directly described through the process equation. The harmonic retrieval problem is solved using three separate nonlinear filters and three iterated forms of the extended Kalman filter. The nonlinear filters offer a significant advantage over batch-type estimators in that time varying system parameters can be modeled. A problem that has been studied by other authors is addressed and it is found that the nonlinear filters offer a significant advantage over other techniques such as modified singular value decomposition and cumulant-based techniques whenever the initial estimation error is constrained. It is shown that the nonlinear filters can be used effectively in colored Gaussian noise with known or unknown coefficients, and in measurement noise with known and unknown covariances. Another advantage of the nonlinear filter approach is that these filters converge relatively fast, making them well-suited for short data lengths.

A joint detection/estimation (JD/E) procedure is presented and applied to problems with model uncertainty and/or uncertain initial conditions. The implementation of this procedure consists of several filters operating in parallel. Each filter hypothesizes a different measurement or process model, different initial conditions, or both. The estimators act independently of the detection mechanism. The link between the two is provided by the a posteriori probability, which is evaluated for any arbitrary density function. These estimators can include any recursive filter such as the linear Kalman filter, nonlinear filters, or the HOFs.

The JD/E approach is applied to model order selection. A general approach is presented for determining the number of sinusoids present in measurements corrupted by additive white Gaussian and non-Gaussian noise. The approach involves the simultaneous application of maximum a posteriori (MAP) detection and nonlinear estimation of the state variables, which consist of the amplitudes and frequencies of sinusoids in each model. Estimation is performed using the extended Kalman filter when the noise is Gaussian, and the extended high order filter (EHOF) when the noise is non-Gaussian. The initial state estimates are constrained to be within an initial variance. The problem is formulated as a multiple hypothesis testing problem with assumed known a priori probabilities for each hypothesis. Each hypothesis represents a different model. Experimental evaluation of this approach demonstrates excellent performance for model order selection and system parameter estimation in both Gaussian and non-Gaussian noise.

The JD/E approach for problems with uncertain initial conditions is applied to the estimation of the time delay and Doppler shift from the active echo returns of monostatic sensor(s). The problem becomes one of localizing a target in range-Doppler space. The range-Doppler space is partitioned into a number of resolution cells. Each cell is identified with a hypothesis that the signal is present in it. The joint detection/estimation scheme is then used to localize the target and refine its parameter estimates (i.e. time delay and Doppler shift). The measurements that are used to localize the target consist of signal returns corrupted by additive white Gaussian and non-Gaussian noise. It is found that the initial estimation error for time delay must be within $1/(2 f_c)$ for any given estimator to form an accurate estimate of target position. This requirement can be relaxed with the JD/E scheme, since a very large initial estimation error can be segmented into a number of filters, each with a much smaller error. The MAP estimate gives very good results under these conditions.

The  JD/E  approach  for  the  combination  of  model  uncertainty  and  uncertain 
initial  conditions  is  applied  to  the  problem  of  data  fusion  from  two  cooperating,  non- 
collocated  sensors  that  are  attempting  to  detect  a  target  and  estimate  its  position. 
The  geometric  areas  of  coverage  of  the  two  sensors  partially  overlap.  Thus,  the 
model  is  general  enough  to  include  sensor  misalignment.  In  the  overlap  region  the 
estimates  from  the  two  sensors  are  combined  to  produce  improved  estimates  over 
the  single  sensor  estimates. 

Several hypotheses are postulated for detection. Each hypothesis corresponds to the ability of each sensor to detect the target in its area of coverage. The a priori probability of each decision is based on the area of coverage of the two sensors. For each hypothesis, a high order filter recursively estimates time delay, Doppler shift and geometric angle to the target from processing the returns of the transmitted signal from each sensor. These estimates are in turn used to estimate target position and velocity. For each of these hypotheses, another set of parallel filters is used to obtain more accurate estimates of signal parameters and to account for the stability problems that result from the first order Taylor series expansion used in the nonlinear filtering algorithms. This is accomplished by operating a separate filter for each of several different initial time delay estimates of the return signal. The maximum likelihood estimate for a given hypothesis is then determined as a weighted sum of the estimates from each of the local hypotheses, with the a posteriori probability being used as the weighting function. It is assumed that the signals are embedded in Gaussian noise and clutter. The clutter is treated as non-Gaussian noise with a lognormal or Weibull distribution. Excellent performance is obtained using the JD/E approach, with high detection probability and very good target position estimates.

The restriction of small initial estimation error made for the harmonic retrieval and model order selection problems can be relaxed if the JD/E approach is used for estimation (the model with uncertain initial conditions). The model order selection initial conditions can also be relaxed if the JD/E approach for model uncertainty and uncertain initial conditions is used.

Since  the  estimation  for  each  of  the  hypotheses  in  the  JD/E  approach  is 
performed  independently,  this  scheme  is  a  natural  application  for  parallel  processing. 
The  model  selection  or  detection  decision  can  be  made  by  a  centralized  processor 
after  all  of  the  data  is  processed.  Thus,  the  JD/E  approach  is  very  well  suited  for 
real-time  implementation  using  advanced  massively  parallel  computer  architectures. 
Appendix


Cramer-Rao  Bound  for  the  Harmonic  Retrieval  Problem 

This appendix presents the derivation of the Cramer-Rao bound for $P$ exponentially damped sinusoids in white Gaussian noise. Consider the measurement model given by

$$z_k = \sum_{p=1}^{P} c_{kp} \exp\left(-\alpha_{kp} k + j(\omega_{kp} k + \theta_{kp})\right) + v_k = h_k(x_k) + v_k \qquad (A.1)$$

for $k = 0, 1, \cdots, K-1$. $v_k$ is assumed to be complex white Gaussian noise with mutually independent real and imaginary components, each with variance $\sigma^2$. The elements of the state variable vector $x_k$ are defined as

$$\begin{aligned}
x_{k,4(p-1)+1} &= \omega_{kp} \\
x_{k,4(p-1)+2} &= c_{kp} \\
x_{k,4(p-1)+3} &= \theta_{kp} \\
x_{k,4(p-1)+4} &= \alpha_{kp}
\end{aligned} \qquad (A.2)$$

The objective is to estimate some or all of the $4P$ possibly time-varying parameters in this system based on the measurements. The probability density function of the set of measurements $z = [z_0, z_1, \cdots, z_{K-1}]^T$ conditioned on the unknown parameters $x_k$ is given by

$$p(z | x_k) = (2\pi\sigma^2)^{-K} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{k=0}^{K-1} \left| z_k - h_k(x_k) \right|^2 \right\} \qquad (A.3)$$


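The Cramer-Rao bound [73] (pp. 66, 84) gives the minimum possible variance of any unbiased estimate $\hat{x}_k(z)$ of the state $x_k$. In the absence of prior information about the state this bound is given by

$$\mathrm{Var}[\hat{x}_k(z) - x_k] \geq J_D^{-1} \qquad (A.4)$$

where $J_D$ is the Fisher information matrix given by

$$J_D = -E\left[ \frac{\partial^2 \ln p(z | x_k)}{\partial x_k \, \partial x_k^T} \right]$$

Applying this to (A.3) yields

$$J_D = \frac{1}{2\sigma^2} E\left[ \frac{\partial^2}{\partial x_k \, \partial x_k^T} \sum_{k=0}^{K-1} \left( -z_k h_k(x_k)^* - z_k^* h_k(x_k) + h_k(x_k) h_k(x_k)^* \right) \right]$$

which reduces to

$$J_D = \frac{1}{2\sigma^2} \sum_{k=0}^{K-1} \left[ \frac{\partial h_k(x_k)}{\partial x_k} \frac{\partial h_k(x_k)^*}{\partial x_k^T} + \frac{\partial h_k(x_k)^*}{\partial x_k} \frac{\partial h_k(x_k)}{\partial x_k^T} \right] \qquad (A.5)$$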


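The partial derivatives in (A.5) can be evaluated by finding expressions for the terms of

$$h_k(x_k) = \sum_{p=1}^{P} h_{kp}(x_k)$$

where the measurement component $h_{kp}(x_k)$ represents the contribution from the $p$th sinusoid and is given by

$$h_{kp}(x_k) = c_{kp} \exp\left(-\alpha_{kp} k + j(\omega_{kp} k + \theta_{kp})\right)$$

The two partial derivative vectors can then be expressed, for the four elements of $x_k$ belonging to the $p$th sinusoid, as

$$\frac{\partial h_k(x_k)}{\partial x_k} = \begin{bmatrix} jk \\ 1/c_{kp} \\ j \\ -k \end{bmatrix} h_{kp}(x_k), \qquad \frac{\partial h_k(x_k)^*}{\partial x_k} = \begin{bmatrix} -jk \\ 1/c_{kp} \\ -j \\ -k \end{bmatrix} h_{kp}(x_k)^*$$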

A.1 Case 1 : Estimation of Two Frequencies and Damping Coefficients

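For estimation of the parameters of two exponentially damped sinusoids the measurement equation becomes

$$z_k = \sum_{p=1}^{P} h_{kp}(x_k) + v_k$$

where

$$h_{kp}(x_k) = \exp\left(-\alpha_{kp} k + j \omega_{kp} k\right) \qquad (A.7)$$

The elements of the state variable vector $x_k$ are defined as

$$\begin{aligned}
x_{k1} &= \omega_1 \\
x_{k2} &= \alpha_1 \\
x_{k3} &= \omega_2 \\
x_{k4} &= \alpha_2
\end{aligned} \qquad (A.8)$$

and the partial derivatives in (A.5) become

$$\frac{\partial h_k(x_k)}{\partial x_k} = \begin{bmatrix} jk \, h_{k1}(x_k) \\ -k \, h_{k1}(x_k) \\ jk \, h_{k2}(x_k) \\ -k \, h_{k2}(x_k) \end{bmatrix}, \qquad \frac{\partial h_k(x_k)^*}{\partial x_k} = \begin{bmatrix} -jk \, h_{k1}(x_k)^* \\ -k \, h_{k1}(x_k)^* \\ -jk \, h_{k2}(x_k)^* \\ -k \, h_{k2}(x_k)^* \end{bmatrix}$$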

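The Fisher information matrix for this system becomes

$$J_D = \frac{1}{\sigma^2} \begin{bmatrix} a & 0 & c & 0 \\ 0 & a & 0 & c \\ c & 0 & b & 0 \\ 0 & c & 0 & b \end{bmatrix} \qquad (A.9)$$

where

$$a = \sum_{k=0}^{K-1} k^2 e^{-2\alpha_1 k}, \qquad b = \sum_{k=0}^{K-1} k^2 e^{-2\alpha_2 k}, \qquad c = \sum_{k=0}^{K-1} k^2 e^{-(\alpha_1 + \alpha_2) k} \cos\left((\omega_1 - \omega_2) k\right)$$

Inverting this analytically the CR bound becomes

$$\mathrm{Var}[\hat{x}_k(z) - x_k] \geq \sigma^2 \begin{bmatrix} a' & 0 & c' & 0 \\ 0 & a' & 0 & c' \\ c' & 0 & b' & 0 \\ 0 & c' & 0 & b' \end{bmatrix}$$

where

$$a' = \frac{a b^2 - b c^2}{a^2 b^2 - 2 a b c^2 + c^4} = \frac{b}{ab - c^2}$$

$$b' = \frac{a^2 b - a c^2}{a^2 b^2 - 2 a b c^2 + c^4} = \frac{a}{ab - c^2}$$

$$c' = \frac{-c \, (ab - c^2)}{a^2 b^2 - 2 a b c^2 + c^4} = \frac{-c}{ab - c^2}$$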

From  this  the  following  observations  are  made: 

1  For  a  given  sinusoid,  the  CR  bound  for  the  frequency  estimate  is  the  same 
as  that  for  the  damping  coefficient  estimate. 

2  A  higher  damping  coefficient  gives  a  larger  CR  bound. 

3  The  bound  is  dependent  on  the  difference  between  the  two  frequencies  and 
not  their  individual  values. 
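The closed-form inverse and the observations above can be checked numerically. The sketch below builds the Fisher information matrix in the block form given for this case, inverts it with a general-purpose routine, and leaves the comparison with $\sigma^2 b/(ab - c^2)$ and $-\sigma^2 c/(ab - c^2)$ to the reader. The parameter values and function names are hypothetical, chosen only for illustration.

```python
import numpy as np

def fisher_terms(alpha1, alpha2, omega1, omega2, K):
    """Sums a, b, c of the two-sinusoid Fisher information matrix."""
    k = np.arange(K)
    a = np.sum(k ** 2 * np.exp(-2.0 * alpha1 * k))
    b = np.sum(k ** 2 * np.exp(-2.0 * alpha2 * k))
    c = np.sum(k ** 2 * np.exp(-(alpha1 + alpha2) * k) * np.cos((omega1 - omega2) * k))
    return a, b, c

def cr_bound(a, b, c, sigma2):
    """Numerical inverse of J_D for the state ordering (w1, alpha1, w2, alpha2)."""
    J = np.array([[a, 0, c, 0],
                  [0, a, 0, c],
                  [c, 0, b, 0],
                  [0, c, 0, b]], dtype=float) / sigma2
    return np.linalg.inv(J)

# Hypothetical parameter values.
sigma2 = 0.01
a, b, c = fisher_terms(alpha1=0.1, alpha2=0.2, omega1=1.0, omega2=1.5, K=32)
bound = cr_bound(a, b, c, sigma2)
det = a * b - c ** 2

# Shifting both frequencies by the same amount leaves a, b, c unchanged.
shifted = fisher_terms(alpha1=0.1, alpha2=0.2, omega1=1.7, omega2=2.2, K=32)
```

The diagonal entries for the frequency and the damping coefficient of a given sinusoid coincide, and the shifted-frequency terms match the original ones, in line with observations 1 and 3.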

A.2  Case  2  :  CR  Bound  with  a  priori  Information 

In the case where the statistics of the initial estimation error are known and are given by the initial covariance $P_0$, the CR bound (A.4) becomes

$$\mathrm{Var}[\hat{x}_k(z) - x_k] \geq [J_D + J_P]^{-1} \qquad (A.11)$$

where $J_P = P_0^{-1}$.

If $J_D$ is small relative to $J_P$ then the bound will be controlled by the initial covariance. From equation (A.5) it can be seen that as the noise variance $\sigma^2$ increases, $J_D$ decreases. However, even though the increased noise increases the CR bound, there will be a point at which $J_P$ dominates, implying that the optimal unbiased estimate is always at least as good as the initial estimate. Thus $J_P$ sets an upper limit, $P_0$, on the bound on the variance in the estimate.
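A scalar version of (A.11) makes this saturation easy to see. In the sketch below the data-dependent information is written as $J_D = a/\sigma^2$ with a hypothetical constant $a$, and the bound is evaluated for increasing noise variance. The names and numbers are illustrative assumptions only.

```python
def cr_bound_with_prior(fisher_data, prior_variance):
    """Scalar form of (A.11): 1 / (J_D + J_P), with J_P = 1 / P_0."""
    return 1.0 / (fisher_data + 1.0 / prior_variance)

P0 = 0.5     # hypothetical initial error variance
a = 100.0    # hypothetical data-dependent factor, so that J_D = a / sigma^2

noise_variances = (0.1, 10.0, 1e6)
bounds = [cr_bound_with_prior(a / sigma2, P0) for sigma2 in noise_variances]
```

As the noise variance grows the bound increases monotonically but never exceeds $P_0$, which it approaches in the limit.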


